Abstract. With more than four billion cellular phones in use worldwide, mobile
advertising has become an attractive alternative to online advertisements.
In this paper, we propose a new targeted advertising policy for Wireless Service
Providers (WSPs) via SMS or MMS, namely AdCell. In our model, a WSP
charges the advertisers for showing their ads. Each advertiser has a valuation for
specific types of customers at various times and locations and has a limit on the
maximum available budget. Each query is in the form of time and location and is
associated with one individual customer. In order to achieve a non-intrusive
delivery, only a limited number of ads can be sent to each customer. Recently, new
services have been introduced that offer location-based advertising over cellular
networks and fit in our model (e.g., ShopAlerts by AT&T).
We consider both the online and offline versions of the AdCell problem and develop
approximation algorithms with constant competitive ratio. For the online version, 
we assume that the appearances of the queries follow a stochastic distribution 
and thus consider a Bayesian setting. Furthermore, queries may come from different
distributions at different times. This model generalizes several previous
advertising models such as the online secretary problem [10], online bipartite
matching [13,7] and AdWords [18]. Since our problem generalizes the well-known
secretary problem, no non-trivial approximation can be guaranteed in the online
setting without stochastic assumptions. We propose an online algorithm that is
simple, intuitive and easily implementable in practice. It is based on pre-computing
a fractional solution for the expected scenario and relies on a novel use of dy- 
namic programming to compute the conditional expectations. We give tight lower 
bounds on the approximability of some variants of the problem as well. In the offline
setting, where full information is available, we achieve near-optimal bounds,
matching the integrality gap of the considered linear program. We believe that our 
proposed solutions can be used for other advertising settings where personalized 
advertisement is critical. 
1 Introduction 

In this paper, we propose a new mobile advertising concept called AdCell. More than 
4 billion cellular phones are in use world-wide, and with the increasing popularity of 
smart phones, mobile advertising holds the prospect of significant growth in the near fu- 
ture. Some research firms JT] estimate mobile advertisements to reach a business worth 
over 10 billion US dollars by 2012. Given the built-in advertisement solutions from
popular smart phone OSes, such as iAds for Apple's iOS, the mobile advertising market is
poised for even faster growth.

In the mobile advertising ecosystem, wireless service providers (WSPs) render the 
physical delivery infrastructure, but so far WSPs have been more or less left out from 
profiting via mobile advertising because of several challenges. First, unlike web, search, 
application, and game providers, WSPs typically do not have users' application context, 
which makes it difficult to provide targeted advertisements. Deep Packet Inspection
(DPI) techniques, which examine packet traces in order to understand application context,
are often not an option because of privacy and legislation issues (e.g., the Federal Wiretap
Act). Therefore, a targeted advertising solution for WSPs needs to utilize only the
information it is allowed to collect by government and by customers via opt-in mechanisms.
Second, without the luxury of application context, targeted ads from WSPs require non- 
intrusive delivery methods. While users are familiar with other ad forms such as ban- 
ner, search, in-application, and in-game, push ads with no application context (e.g., via 
SMS) can be intrusive and annoying if not done carefully. The number and frequency 
of ads both need to be well-controlled. Third, targeted ads from WSPs should be well 
personalized such that the users have incentive to read the advertisements and take pur- 
chasing actions, especially given the requirement that the number of ads that can be 
shown to a customer is limited. 

In this paper, we propose a new mobile targeted advertising strategy, AdCell, for 
WSPs that deals with the above challenges. It takes advantage of the detailed real-time 
location information of users. Location can be tracked upon users' consent. This is
already being done in some services offered by WSPs, such as Sprint's Family Location
and AT&T's Family Map, so there are no associated privacy or legal complications. To
be located, a cellular phone must emit a roaming signal to contact some nearby antenna
tower, but the process does not require an active call. GSM localization is then done by
multilateration³ based on the signal strength to nearby antenna masts. Location-based
advertisement is not completely new. The Foursquare mobile application allows users
to explicitly "check in" at places such as bars and restaurants, and the shops can
advertise accordingly. Similarly, there are also automatic proximity-based advertisements
using GPS or Bluetooth. For example, some GPS models from Garmin display ads for
nearby businesses based on the GPS locations [23]. ShopAlerts by AT&T⁴ is another
application along the same line. On the advertiser side, popular stores such as Starbucks 
are reported to have attracted significant footfalls via mobile coupons. 



3 The process of locating an object by accurately computing the time difference of arrival of a 
signal emitted from that object to three or more receivers. 

4 http://shopalerts.att.com/sho/att/index.html 
Most of the existing mobile advertising models are on-demand; AdCell, however,
sends the ads via SMS, MMS, or similar methods without any prior notice. Thus, to deal
with the non-intrusive delivery challenge, we propose user subscription to advertising
services that deliver only a fixed number of ads per month to their subscribers (as is the
case in AT&T ShopAlerts). The constraint of delivering a limited number of ads to each
customer adds the main algorithmic challenge in the AdCell model (details in Section
1.1). In order to overcome the incentive challenge, the WSP can "pay" users to read
ads and purchase based on them through a reward program in the form of credit for the
monthly wireless bill. To begin with, both customers and advertisers should sign up for
the AdCell service provided by the WSP (e.g., currently there are 9 chain companies
participating in ShopAlerts). Customers enrolled for the service should sign an agreement
that their location information will be tracked, but solely for advertisement
purposes. Advertisers (e.g., stores) provide their advertisements and a maximum chargeable
budget to the WSP. The WSP selects proper ads (these, for example, may depend
on the time and the distance of a customer from a store) and sends them (via SMS) to the
customers. The WSP charges the advertisers for showing their ads and also for successful
ads. An ad is deemed successful if a customer visits the advertised store. Depending on
the service plan, customers are entitled to receive different numbers of advertisements
per month. Several logistics need to be employed to improve the AdCell experience and
enthuse customers into participation. We provide more details about these logistics in
the full paper.

1.1 AdCell Model & Problem Formulation 

In the AdCell model, advertisers bid for individual customers based on their location 
and time. The triple $(k, \ell, t)$, where $k$ is a customer, $\ell$ is a neighborhood (location) and
$t$ is a time, forms a query, and there is a bid amount (possibly zero) associated with
each query for each advertiser. This definition of query allows advertisers to customize
their bids based on customers, neighborhoods and time. We assume a customer can
only be in one neighborhood at any particular time and thus at any time $t$ and for each
customer $k$, the queries $(k, \ell_1, t)$ and $(k, \ell_2, t)$ are mutually exclusive, for all distinct
$\ell_1, \ell_2$. Neighborhoods are places of interest such as shopping malls, airports, etc. We
assume that queries are generated at certain times (e.g., every half hour) and only if a 
customer stays within a neighborhood for a specified minimum amount of time. The 
formal problem definition of AdCell Allocation is as follows: 

AdCell Allocation. There are $m$ advertisers, $n$ queries and $s$ customers. Advertiser $i$
has a total budget $b_i$ and bids $u_{ij}$ for each query $j$. Furthermore, for each customer $k \in [s]$,
let $S_k$ denote the queries corresponding to customer $k$ and $c_k$ denote the maximum
number of ads which can be sent to customer $k$. The capacity $c_k$ is associated with
customer $k$ and is dictated by the AdCell plan the customer has signed up for. Advertiser
$i$ pays $u_{ij}$ if his advertisement is shown for query $j$ and if his budget is not exceeded.
That is, if $x_{ij}$ is an indicator variable set to 1 when the advertisement for advertiser $i$ is
shown on query $j$, then advertiser $i$ pays a total amount of $\min(\sum_j x_{ij} u_{ij}, b_i)$. The
goal of AdCell Allocation is to specify an advertisement allocation plan such that the
total payment $\sum_i \min(\sum_j x_{ij} u_{ij}, b_i)$ is maximized.
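
The objective and the capacity constraint of AdCell Allocation can be made concrete with a short sketch. The function name `adcell_revenue` and the dictionary-based data layout below are illustrative assumptions, not part of the paper; the code only evaluates a given allocation and does not search for an optimal one.

```python
from collections import defaultdict


def adcell_revenue(bids, budgets, customer_of, capacity, assignment):
    """Total payment sum_i min(sum_j x_ij * u_ij, b_i) of a given allocation.

    bids[i][j]     -- u_ij, the bid of advertiser i for query j
    budgets[i]     -- b_i, the total budget of advertiser i
    customer_of[j] -- the customer k whose set S_k contains query j
    capacity[k]    -- c_k, the maximum number of ads for customer k
    assignment[j]  -- advertiser shown on query j, or None if no ad is sent
    (All names and the data layout are hypothetical, chosen for illustration.)
    """
    ads_sent = defaultdict(int)   # number of ads delivered to each customer
    spend = defaultdict(float)    # sum_j x_ij * u_ij for each advertiser
    for j, i in assignment.items():
        if i is None:
            continue
        ads_sent[customer_of[j]] += 1
        spend[i] += bids[i][j]
    # Hard capacity constraint: at most c_k ads per customer.
    for k, count in ads_sent.items():
        if count > capacity[k]:
            raise ValueError(f"customer {k!r} receives {count} ads, capacity is {capacity[k]}")
    # Each advertiser pays at most its budget.
    return sum(min(total, budgets[i]) for i, total in spend.items())
```

Because `assignment` maps each query to a single advertiser, the requirement that a query is shown at most one ad holds by construction; only the capacity and budget limits need explicit handling.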
The AdCell problem is a generalization of the budgeted AdWords allocation problem
[4,21] with a capacity constraint on each customer and thus is NP-hard. Along with
the offline version of the problem, we also consider its online version where queries
arrive online and a decision to assign a query to an advertiser has to be made right away.
With arbitrary queries/bids and optimizing for the worst case, one cannot obtain any
approximation algorithm with ratio better than $\frac{1}{n}$. This follows from the observation
that the online AdCell problem also generalizes the secretary problem, for which no
deterministic or randomized online algorithm can get an approximation ratio better than $\frac{1}{n}$ in
the worst case⁵. Therefore, we consider a stochastic setting.

For the online AdCell problem, we assume that each query $j$ arrives with probability
$p_j$. Upon arrival, each query has to be either allocated or discarded right away.
We note that each query encodes a customer id, a location id and a time stamp. Also 
associated with each query, there is a probability, and a vector consisting of the bids for 
all advertisers for that query. Furthermore, we assume that all queries with different ar- 
rival times or from different customers are independent, however queries from the same 
customer with the same arrival time are mutually exclusive (i.e., a customer cannot be 
in multiple locations at the same time). 



1.2 Our Results and Techniques 

Here we provide a summary of our results and techniques. We consider both the offline
and online versions of the problem. In the offline version, we assume that we know
exactly which queries arrive. In the online version, we only know the arrival probabilities
of queries (i.e., $p_1, \ldots, p_n$).

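We can write the AdCell problem as the following random integer program in which
$I_j$ is the indicator random variable which is 1 if query $j$ arrives and 0 otherwise:

$$
\begin{aligned}
\text{maximize} \quad & \sum_i \min\Big(\sum_j x_{ij} u_{ij},\ b_i\Big) && (IP_{BC}) \\
\text{s.t.} \quad & \forall j \in [n]: \ \sum_i x_{ij} \le I_j && (F) \\
& \forall k \in [s]: \ \sum_{j \in S_k} \sum_i x_{ij} \le c_k && (C) \\
& x_{ij} \in \{0, 1\}
\end{aligned}
$$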

We will refer to the variant of the problem explained above as $IP_{BC}$. We also consider
variants in which there are either budget constraints or capacity constraints but not both.
We refer to these variants as $IP_B$ and $IP_C$ respectively. The above integer program can
be relaxed to obtain a linear program $LP_{BC}$, where we maximize $\sum_i \sum_j x_{ij} u_{ij}$ with
the constraints (F), (C) and the additional budget constraint $\sum_j x_{ij} u_{ij} \le b_i$, which we
refer to by (B). We relax $x_{ij} \in \{0, 1\}$ to $x_{ij} \in [0, 1]$. We also refer to the variant of
5 The reduction of the secretary problem to AdCell problem is as follows: consider a single 
advertiser with large enough budget and a single customer with a capacity of 1. The queries 
correspond to secretaries and the bids correspond to the values of the secretaries. So we can 
only allocate one query to the advertiser. 
this linear program with only either constraints of type (B) or constraints of type (C)
as $LP_B$ and $LP_C$.

In the offline version, for all $i \in [m]$ and $j \in [n]$, the values of $I_j$ are precisely
known. For the online version, we assume that we know $E[I_j]$ in advance and we learn
the actual value of $I_j$ online. We note a crucial difference between our model and the
i.i.d. model. In the i.i.d. model the probability of the arrival of a query is independent of the
time, i.e., queries arrive from the same distribution at each time. However, in the AdCell
model a query encodes time (in addition to location and customer id), hence we may
have a different distribution at each time. This implies a prophet inequality setting in
which at each time, an onlooker has to decide according to a given value, where this
value may come from a different distribution at different times (e.g., see [14,11]).

A summary of our results is shown in Table 1. In the online version, we compare
the expected revenue of our solution with the expected revenue of the optimal offline
algorithm. We should emphasize that we make no assumptions about bid to budget ratios
(e.g., bids could be as large as budgets). In the offline version, our result matches the
known bounds on the integrality gap.

We now briefly describe our main techniques. 

Breaking into smaller sub-problems that can be optimally solved using conditional
expectation. Theoretically, ignoring the computational issues, any online stochastic
optimization problem can be solved optimally using conditional expectation as follows: At
any time a decision needs to be made, compute the total expected objective conditioned
on each possible decision, then choose the one with the highest total expectation. These
conditional expectations can be computed by backward induction, possibly using a dynamic
program. However, for most problems, including the AdCell problem, the size of
this dynamic program is exponential, which makes it impractical. We avoid this issue by
using a randomized strategy to break the problem into smaller subproblems such that
each subproblem can be solved by a quadratic dynamic program.

Using an LP to analyze the performance of an optimal online algorithm against 
an optimal offline fractional solution. Note that we compare the expected objective 
value of our algorithm against the expected objective value of the optimal offline frac- 
tional solution. Therefore for each subproblem, even though we use an optimal online 
algorithm, we still need to compare its expected objective value against the expected 
objective value of the optimal offline solution for that subproblem. Basically, we need 
to compare the expected objective of a stochastic online algorithm, which works by 



Offline Version

- A $\frac{3}{4}$-approximation algorithm.
- A $\frac{4-\epsilon}{4}$-approximation algorithm when $\forall i: \max_j u_{ij} \le \epsilon b_i$.

Online Version

- A $(\frac{1}{2} - \frac{1}{e})$-approximation algorithm.
- A $(1 - \frac{1}{e})$-approximation algorithm with only budget constraints.
- A $\frac{1}{2}$-approximation algorithm with only capacity constraints.

Table 1. Summary of Our Results
maximizing conditional expectation at each step, against the expected objective value
of its optimal offline solution. To do this, we create a minimization linear program that
encodes the dynamic program and whose optimal objective is the minimum ratio of the
expected objective value of the online algorithm to the expected objective value of the
optimal offline solution. We then prove a lower bound of $\frac{1}{2}$ on the objective value of
this linear program by constructing a feasible solution for its dual obtaining an objective
value of $\frac{1}{2}$.

Rounding method of [20] and handling hard capacities. Handling "hard capacities",
those that cannot be violated, is generally tricky in various settings including facility
location and many covering problems [5,8,19]. The AdCell problem is a generalization
of the budgeted AdWords allocation problem with hard capacities on queries involving
each customer. Our essential idea is to iteratively round the fractional LP solution to
an integral one based on the current LP structure. The algorithm uses the rounding
technique of [20] and is significantly harder than its uncapacitated version.

In the interest of space, we defer the omitted proofs to the full paper.

2 Related Work 

Online advertising alongside search results is a multi-billion dollar business [15] and is
a major source of revenue for search engines like Google, Yahoo and Bing. A related
ad allocation problem is the AdWords assignment problem [18] that was motivated by
sponsored search auctions. When modeled as an online bipartite assignment problem,
each edge has a weight, and there is a budget on each advertiser representing the upper
bound on the total weight of edges that might be assigned to it. In the offline setting,
this problem is NP-hard, and several approximations have been proposed [3,2,4,21].
For the online setting, it is typical to assume that edge weights (i.e., bids) are much
smaller than the budgets, in which case there exists a $(1 - 1/e)$-competitive online
algorithm [18]. Recently, Devanur and Hayes [6] improved the competitive ratio to
$(1 - \epsilon)$ in the stochastic case where the sequence of arrivals is a random permutation.

Another related problem is the online bipartite matching problem, which was introduced
by Karp, Vazirani, and Vazirani [13]. They proved that a simple randomized online
algorithm achieves a $(1 - 1/e)$-competitive ratio and that this factor is the best possible.
Online bipartite matching has been considered under stochastic assumptions in [9,7,17],
where improvements over the $(1 - 1/e)$ approximation factor have been shown. The most
recent of them is the work of Manshadi et al. [17], which presents an online algorithm
with a competitive ratio of 0.702. They also show that no online algorithm can achieve
a competitive ratio better than 0.823. More recently, Mahdian et al. [16] and Mehta et
al. [12] improved the competitive ratio to 0.696 for unknown distributions.

3 Online Setting 

In this section, we present three online algorithms for the three variants of the problem
mentioned in the previous section (i.e., $IP_B$, $IP_C$ and $IP_{BC}$).
First, we present the following lemma which provides a means of computing an 
upper bound on the expected revenue of any algorithm (both online and offline) for the 
AdCell problem. 

Lemma 1 (Expectation Linear Program). Consider a general random linear program
in which $b$ is a vector of random variables:

(Random LP)

$$\text{maximize} \quad c^T x \qquad \text{s.t.} \quad Ax \le b; \quad x \ge 0$$

Let $OPT(b)$ denote the optimal value of this program as a function of the random
variables. Now consider the following linear program:

(Expectation LP)

$$\text{maximize} \quad c^T x \qquad \text{s.t.} \quad Ax \le E[b]; \quad x \ge 0$$

We refer to this as the "Expectation Linear Program" corresponding to the "Random
Linear Program". Let $\overline{OPT}$ denote the optimal value of this program. Assuming
that the original linear program is feasible for all possible draws of the random variables,
it always holds that $E[OPT(b)] \le \overline{OPT}$.

Proof. Let $x^*(b)$ denote the optimal assignment as a function of $b$. Since the random LP
is feasible for all realizations of $b$, we have $Ax^*(b) \le b$. Taking the expectation of
both sides, we get $AE[x^*(b)] \le E[b]$. So, by setting $x = E[x^*(b)]$ we get a feasible
solution for the expectation LP. Furthermore, since the objective is linear, the objective
value resulting from this assignment is $c^T E[x^*(b)] = E[c^T x^*(b)] = E[OPT(b)]$, which is
the expected optimal value of the random LP. The optimal value
of the expectation LP might however be higher, so its optimal value is an upper bound
on the expected optimal value of the random LP.
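
As a sanity check of Lemma 1, the following sketch evaluates both sides of the inequality on a toy random LP with one variable: maximize $x$ subject to $x \le b_1$, $x \le b_2$, $x \ge 0$, whose optimal value is simply $\min(b_1, b_2)$. The instance (two independent right-hand sides, each equal to 1 or 3 with probability 1/2) is an assumption made purely for illustration.

```python
from itertools import product


def opt(b1, b2):
    # Optimal value of: maximize x  s.t.  x <= b1, x <= b2, x >= 0  (for b1, b2 >= 0).
    return min(b1, b2)


support = [1.0, 3.0]                       # toy assumption: each b_k is 1 or 3 w.p. 1/2
draws = list(product(support, repeat=2))   # the four equally likely realizations of (b1, b2)

# Left-hand side of Lemma 1: E[OPT(b)], averaged over all realizations.
expected_opt = sum(opt(b1, b2) for b1, b2 in draws) / len(draws)

# Right-hand side: optimal value of the expectation LP, where b is replaced by E[b].
mean_b = sum(support) / len(support)
opt_expectation_lp = opt(mean_b, mean_b)

print(expected_opt, opt_expectation_lp)
```

In this instance the expectation LP is strictly more optimistic than the average over realizations, which is exactly the slack the lemma allows.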

As we will see next, not only does the expectation LP provide an upper bound
on the expected revenue, it also leads to a good approximation algorithm for the online
allocation, as we explain in the following online allocation algorithm. We adopt the
notation of using an overline to denote the expectation linear program corresponding to
a random linear program (e.g., $\overline{LP}_{BC}$ for $LP_{BC}$). Next we present an online algorithm
for the variant of the problem in which there are only budget constraints but not capacity
constraints.

Algorithm 1 (Stochastic Online Allocator for $IP_B$)

- Compute an optimal assignment for the corresponding expectation LP (i.e., $\overline{LP}_B$).
Let $x^*_{ij}$ denote this assignment. Note that $x^*_{ij}$ might be a fractional assignment.

- If query $j$ arrives, for each $i \in [m]$ allocate the query to advertiser $i$ with probability
$x^*_{ij} / p_j$.
Theorem 1. The expected revenue of Algorithm 1 is at least $1 - \frac{1}{e}$ of the optimal value of the
expectation LP (i.e., $\overline{LP}_B$), which implies that the expected revenue of Algorithm 1 is at least
$1 - \frac{1}{e}$ of the expected revenue of the optimal offline allocation too. Note that this result
holds even if the $u_{ij}$'s are not small compared to $b_i$. Furthermore, this result holds even
if we relax the independence requirement in the original problem and require negative
correlation instead.

Note that allowing negative correlation instead of independence makes the above
model much more general than it may seem at first. For example, suppose there is a
query that may arrive at several different times but may arrive at most once or only
a limited number of times. We can model this by creating a new query for each possible
instance of the original query. These new copies are, however, negatively correlated. We
define negative correlation as follows:

Definition 1 (Negative Correlation). Let $X_1, \ldots, X_n$ be random variables. For any
subset $S \subseteq \{1, \ldots, n\}$, let $X_S$ denote the subset of random variables indexed by $S$
and let $x_S$ and $x'_S$ denote two realizations of these random variables. We say that
$X_1, \ldots, X_n$ are negatively correlated iff for any random variable $X_i$ and any subset
$X_S$ of random variables (not containing $X_i$) and any constant $c$, if $x_S \le x'_S$ then
$\Pr[X_i \le c \mid X_S = x_S] \le \Pr[X_i \le c \mid X_S = x'_S]$.

Remark 1. It is worth mentioning that there is an integrality gap of $1 - \frac{1}{e}$ between the
optimal value of the integral allocation and the optimal value of the expectation LP. So
the lower bound of Theorem 1 is tight. To see this, consider a single advertiser and $n$
queries. Suppose $p_j = \frac{1}{n}$ and $u_{1j} = 1$ for all $j$. The optimal value of $\overline{LP}_B$ is 1 but even
the expected optimal revenue of the offline optimal allocation is $1 - \frac{1}{e}$ when $n \to \infty$
because with probability $(1 - \frac{1}{n})^n$ no query arrives.
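
The gap in Remark 1 can be tabulated numerically. The sketch below assumes, as the example implicitly requires, that the single advertiser has budget 1, so the offline optimum earns 1 whenever at least one query arrives and 0 otherwise; the function name is hypothetical.

```python
import math


def expected_offline_revenue(n):
    """Expected offline optimum in the Remark 1 instance with n queries.

    Assumes budget 1 for the single advertiser, bids of 1, and independent
    arrivals with probability 1/n each: revenue is 1 unless no query arrives.
    """
    return 1.0 - (1.0 - 1.0 / n) ** n


for n in (1, 2, 10, 1000):
    # The expectation LP value is 1 for every n, so this is also the ratio.
    print(n, expected_offline_revenue(n))

print("limit:", 1.0 - 1.0 / math.e)
```

The values decrease from 1 toward $1 - \frac{1}{e} \approx 0.632$ as $n$ grows, while the expectation LP value stays at 1.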

To prove Theorem 1 we use the following theorem: 

Theorem 2. Let $C$ be an arbitrary positive number and let $X_1, \ldots, X_n$ be independent
random variables (or negatively correlated) such that $X_i \in [0, C]$. Let
$\mu = E[\sum_i X_i]$. Then:

$$E\Big[\min\Big(\sum_i X_i,\ C\Big)\Big] \ge \Big(1 - \frac{1}{e^{\mu/C}}\Big) C$$

Furthermore, if $\mu \le C$ then the right hand side is at least $(1 - \frac{1}{e})\mu$.
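
To see the bound of Theorem 2 on a concrete instance, the sketch below computes $E[\min(\sum_i X_i, C)]$ exactly by enumerating outcomes. It assumes a special case chosen for illustration: each $X_i$ is independent and equals a fixed value with some probability and 0 otherwise. The function names are hypothetical.

```python
import math
from itertools import product


def expected_capped_sum(values, probs, cap):
    """Exact E[min(sum_i X_i, cap)] for independent X_i in {0, values[i]}.

    X_i equals values[i] with probability probs[i] and 0 otherwise
    (an illustrative special case of the variables allowed by the theorem).
    """
    total = 0.0
    for outcome in product([0, 1], repeat=len(values)):
        weight, realized = 1.0, 0.0
        for hit, v, p in zip(outcome, values, probs):
            weight *= p if hit else (1.0 - p)
            realized += v * hit
        total += weight * min(realized, cap)
    return total


def theorem2_bound(values, probs, cap):
    """Right-hand side (1 - e^{-mu/C}) * C with mu = E[sum_i X_i]."""
    mu = sum(v * p for v, p in zip(values, probs))
    return (1.0 - math.exp(-mu / cap)) * cap


values, probs, cap = [1.0, 0.5, 1.0], [0.3, 0.8, 0.5], 1.0
print(expected_capped_sum(values, probs, cap), theorem2_bound(values, probs, cap))
```

The enumeration is exponential in the number of variables, so it is only meant for checking small instances by hand.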

Proof (Theorem 2). Define the random variables $R_i = \max(R_{i-1} - X_i, 0)$ and $R_0 = C$.
Observe that for each $i$, $R_i = \max(C - \sum_{j=1}^{i} X_j, 0)$, so $\min(\sum_{j=1}^{i} X_j, C) + R_i = C$.
Therefore $E[\min(\sum_{j=1}^{n} X_j, C)] + E[R_n] = C$, and to prove the theorem it is enough
to show that $E[R_n] \le \frac{1}{e^{\mu/C}} \cdot C$. To show this we will prove the following inequality:

$$E[R_i] \le \Big(1 - \frac{E[X_i]}{C}\Big) E[R_{i-1}] \qquad (U)$$

Assuming that (U) is true, we can conclude the following, which proves the claim.

$$E[R_n] \le C \cdot \prod_{i=1}^{n} \Big(1 - \frac{E[X_i]}{C}\Big) \le C \cdot \frac{1}{e^{\mu/C}}$$

The last inequality follows from the fact that $\sum_i \frac{E[X_i]}{C} = \frac{\mu}{C}$ and the product
takes its maximum when for all $i$: $\frac{E[X_i]}{C} = \frac{\mu}{nC}$, and $n \to \infty$. Furthermore, to prove the
second claim, we can use the fact that $(1 - x^a) \ge (1 - x)a$ for any $a \le 1$ and conclude
that $(1 - \frac{1}{e^{\mu/C}})C \ge (1 - \frac{1}{e})\frac{\mu}{C} C = (1 - \frac{1}{e})\mu$ whenever $\mu \le C$. Now it only remains to
prove the inequality (U):

$$
\begin{aligned}
E[R_i] &= E[\max(R_{i-1} - X_i, 0)] \\
&\le E\Big[\max\Big(R_{i-1} - X_i \frac{R_{i-1}}{C}, 0\Big)\Big] \\
&= E\Big[R_{i-1} - X_i \frac{R_{i-1}}{C}\Big] \\
&= E[R_{i-1}] - \frac{1}{C} E[X_i R_{i-1}]
\end{aligned}
$$

$R_{i-1}$ and $X_i$ are either independent or positively correlated so:

$$
\begin{aligned}
E[R_i] &\le E[R_{i-1}] - \frac{1}{C} E[X_i] E[R_{i-1}] \\
&= \Big(1 - \frac{E[X_i]}{C}\Big) E[R_{i-1}]
\end{aligned}
$$

That completes the proof.

Now we prove Theorem 1 using the above theorem.

Proof (Theorem 1). We apply Theorem 2 to each advertiser $i$ separately. From the perspective
of advertiser $i$, each query is allocated to her with probability $x^*_{ij}$ and by constraint
(B) we have $\mu = \sum_j x^*_{ij} u_{ij} \le b_i = C$, so $\mu \le C$ and by Theorem 2,
the expected revenue from advertiser $i$ is at least $(1 - \frac{1}{e})(\sum_j x^*_{ij} u_{ij})$. Therefore,
overall, we achieve at least $1 - \frac{1}{e}$ of the optimal value of the expectation LP and that
completes the proof.

Next we present an online algorithm for the variant of the problem in which there
are only capacity constraints but not budget constraints.

Algorithm 2 (Stochastic Online Allocator for $IP_C$)

- Compute an optimal assignment for the corresponding expectation LP (i.e., $\overline{LP}_C$).
Let $x^*_{ij}$ denote this assignment. Note that $x^*_{ij}$ might be a fractional assignment.

- Partition the items into sets $T_1, \ldots, T_u$ in increasing order of their arrival time and
such that all of the items in the same set have the same arrival time.

- For each $k \in [s]$, $t \in [u]$, $r \in [c_k]$, let $E^r_{k,t}$ denote the expected revenue of the
algorithm from queries in $S_k$ (i.e., associated with customer $k$) that arrive at or
after $T_t$ and assuming that the remaining capacity of customer $k$ is $r$. We formally
define $E^r_{k,t}$ later.

- If query $j$ arrives then choose one of the advertisers at random with advertiser $i$
chosen with a probability of $\frac{x^*_{ij}}{p_j}$. Let $k$ and $T_t$ be respectively the customer and
the partition which query $j$ belongs to. Also, let $r$ be the remaining capacity of
customer $k$ (i.e., $r$ is $c_k$ minus the number of queries from customer $k$ that have
been allocated so far). If $u_{ij} + E^{r-1}_{k,t+1} \ge E^r_{k,t+1}$ then allocate query $j$ to advertiser
$i$, otherwise discard query $j$.

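We can now define $E^r_{k,t}$ recursively as follows:

$$
E^r_{k,t} = \sum_{j \in T_t} \sum_{i \in [m]} x^*_{ij} \max\big(u_{ij} + E^{r-1}_{k,t+1},\ E^r_{k,t+1}\big)
+ \Big(1 - \sum_{j \in T_t} \sum_{i \in [m]} x^*_{ij}\Big) E^r_{k,t+1} \qquad (EXP_k)
$$

Also define $E^0_{k,t} = 0$ and $E^r_{k,u+1} = 0$. Note that we can efficiently compute $E^r_{k,t}$ using
dynamic programming.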
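
The per-customer dynamic program and the allocation test of Algorithm 2 can be sketched as follows. The layout is an assumption made for illustration: `slots[t]` lists the pairs (fractional LP value, bid) for every (query, advertiser) combination of one fixed customer at the `t`-th arrival time, with times indexed from 0 instead of 1. The function names are hypothetical, and the fractional LP solution is taken as given rather than computed.

```python
def customer_value_table(slots, capacity):
    """Backward induction for one customer.

    slots[t]  -- list of (x_star, bid) pairs, one per (query in T_t, advertiser)
                 combination belonging to this customer (hypothetical layout)
    capacity  -- c_k, the number of ads this customer may still receive
    Returns E with E[t][r] = expected revenue from time t onward with
    remaining capacity r; row len(slots) and column 0 are the zero boundaries.
    """
    horizon = len(slots)
    E = [[0.0] * (capacity + 1) for _ in range(horizon + 1)]
    for t in range(horizon - 1, -1, -1):
        total_prob = sum(x for x, _ in slots[t])
        for r in range(1, capacity + 1):
            skip = E[t + 1][r]        # value of keeping the capacity unit
            use = E[t + 1][r - 1]     # continuation value after spending one unit
            E[t][r] = (sum(x * max(bid + use, skip) for x, bid in slots[t])
                       + (1.0 - total_prob) * skip)
    return E


def should_allocate(E, t, r, bid):
    """Allocation test: accept iff bid + E[t+1][r-1] >= E[t+1][r]."""
    return r > 0 and bid + E[t + 1][r - 1] >= E[t + 1][r]


# Two arrival times, one candidate (query, advertiser) pair at each.
slots = [[(0.5, 1.0)], [(0.5, 3.0)]]
E = customer_value_table(slots, capacity=1)
print(E[0][1], should_allocate(E, 0, 1, 1.0), should_allocate(E, 1, 1, 3.0))
```

With a single unit of capacity the early bid of 1 is rejected, because holding the unit for the later bid of 3 is worth more in expectation. The table has size (number of arrival times) by (capacity), which is what keeps each subproblem small.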

The main difference between Algorithm 1 and Algorithm 2 is that in the former, whenever we choose an
advertiser at random, we always allocate the query to that advertiser (assuming they
have enough budget). However, in the latter, we run a dynamic program for each customer
$k$ and once an advertiser is picked at random, the query is allocated to this advertiser
only if doing so increases the expected revenue associated with customer $k$.

Theorem 3. The expected revenue of Algorithm 2 is at least $\frac{1}{2}$ of the optimal value of the
expectation LP (i.e., $\overline{LP}_C$), which implies that the expected revenue of Algorithm 2 is at least $\frac{1}{2}$ of the
expected revenue of the optimal offline allocation for $IP_C$ too.

Remark 2. The approximation ratio of Algorithm 2 is tight. There is no online algorithm that can
achieve in expectation better than $\frac{1}{2}$ of the revenue of the optimal offline allocation
without making further assumptions. We show this by providing a simple example.
Consider an advertiser with a large enough budget and a single customer with a capacity
of 1 and two queries. The queries arrive independently with probabilities $p_1 = 1 - \epsilon$ and
$p_2 = \epsilon$, with the first query having an earlier arrival time. The advertiser has submitted
the bids $b_{11} = 1$ and $b_{12} = \frac{1-\epsilon}{\epsilon}$. Observe that no online algorithm can get a revenue
better than $(1 - \epsilon) \times 1 + \epsilon^2 \frac{1-\epsilon}{\epsilon} \approx 1$ in expectation because at the time query 1 arrives, the
online algorithm does not know whether or not the second query is going to arrive and
the expected revenue from the second query is just $1 - \epsilon$. However, the optimal offline
solution would allocate the second query if it arrives and otherwise would allocate the
first query, so its revenue is $\epsilon \frac{1-\epsilon}{\epsilon} + (1 - \epsilon)^2 \times 1 \approx 2$ in expectation.
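
The two expectations in Remark 2 are easy to evaluate for a concrete $\epsilon$. The sketch below assumes the second bid equals $(1-\epsilon)/\epsilon$, the value implied by the statement that the expected revenue from the second query is $1 - \epsilon$; the function name is hypothetical.

```python
def remark2_revenues(eps):
    """Best online and optimal offline expected revenue in the Remark 2 instance.

    Assumes bids 1 and (1 - eps) / eps, arrival probabilities 1 - eps and eps,
    and a single customer with capacity 1.
    """
    p1, p2 = 1.0 - eps, eps
    bid1, bid2 = 1.0, (1.0 - eps) / eps
    # Online policy A: take query 1 if it arrives, otherwise wait for query 2.
    take_first = p1 * bid1 + (1.0 - p1) * p2 * bid2
    # Online policy B: always skip query 1 and wait for query 2.
    skip_first = p2 * bid2
    online = max(take_first, skip_first)
    # Offline: take query 2 whenever it arrives, otherwise query 1 if it arrived.
    offline = p2 * bid2 + (1.0 - p2) * p1 * bid1
    return online, offline


online, offline = remark2_revenues(0.01)
print(online, offline, online / offline)
```

For $\epsilon = 0.01$ the ratio is already close to one half, and it approaches one half as $\epsilon$ shrinks.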

Next, we show that an algorithm similar to the previous one can be used when there 
are both budget constraints and capacity constraints. 
Algorithm 3 (Stochastic Online Allocator for $IP_{BC}$)

Run the same algorithm as in Algorithm 2 except that now $x^*_{ij}$ is a fractional solution of $\overline{LP}_{BC}$
instead of $\overline{LP}_C$.

Theorem 4. The expected revenue of Algorithm 3 is at least $\frac{1}{2} - \frac{1}{e}$ of the optimal value of the
expectation LP (i.e., $\overline{LP}_{BC}$), which implies that the expected revenue of Algorithm 3 is at least
$\frac{1}{2} - \frac{1}{e}$ of the expected revenue of the optimal offline allocation too.

Before we prove the last two theorems, we define a simple stochastic knapsack
problem which will be used as a building block in the proof of Theorem 3.

Definition 2 (Stochastic Uniform Knapsack). There is a knapsack of capacity $C$ and
a sequence of $n$ possible items. Each item $j$ is of size 1, has a value of $v_j$ and arrives
with probability $p_j$. Let $I_j$ denote the indicator random variable indicating the arrival
of item $j$. We assume that items can be partitioned into sets $T_1, \ldots, T_u$ based on their
arrival times such that all the items in the same partition have the same arrival time
and are mutually exclusive (i.e., at most one of them arrives) and items from different
partitions are independent. Furthermore, we assume that $\sum_{j \in [n]} p_j \le C$.

The following algorithm based on conditional expectation computes the optimal 
online allocation for this problem: 

Algorithm 4 (Stochastic Uniform Knapsack - Optimal Online Allocator)

Consider a stochastic uniform knapsack problem as defined in Definition 2.

- For each $t \in [u]$ and $r \in [C]$, let $E^r_t$ denote the expected revenue of the algorithm
from queries that arrive at or after time $t$ (i.e., $T_t, \ldots, T_u$) and assuming that the
remaining capacity of the knapsack is $r$. We formally define $E^r_t$ later.

- If item $j$ arrives do the following. Let $t$ be the index of the partition which $j$ belongs
to and let $r$ be the remaining capacity of the knapsack. Put item $j$ in the knapsack
if $v_j + E^{r-1}_{t+1} \ge E^r_{t+1}$.

$E^r_t$ can be defined recursively as follows and can be efficiently computed using
dynamic programming:

$$E^r_t = \sum_{j \in T_t} p_j \max\big(v_j + E^{r-1}_{t+1},\ E^r_{t+1}\big) + \Big(1 - \sum_{j \in T_t} p_j\Big) E^r_{t+1} \qquad (EXP)$$

Also define $E^0_t = 0$ and $E^r_{u+1} = 0$.

Clearly the above algorithm achieves the best revenue that any online algorithm 
can achieve in expectation for the stochastic uniform knapsack. However, we need a 
stronger result since we need to compare its revenue against the optimal value of the 
expectation LP. 



12 S. Alaei, M.T. Hajiaghayi, V. Liaghat, D. Pei and B. Saha 

Lemma 2. Consider the stochastic uniform knapsack problem as defined in Definition 2.
Let $O_o$ denote the random variable representing the revenue of Algorithm 4 for
this problem (i.e., $E[O_o] = E^C_1$). Also define $O_e = \sum_j p_j v_j$. Assuming that $\sum_j p_j \le C$,
the following always holds:

$$\frac{1}{2} O_e \le E[O_o] \le O_e$$
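
The sandwich in Lemma 2 can be checked numerically on small instances. The sketch below assumes one item per arrival time, so each partition is a single (probability, value) pair, and uses a rolling array over the remaining capacity; the function name is hypothetical and the lower-bound constant $\frac{1}{2}$ is the one stated for the lemma.

```python
def online_knapsack_value(items, capacity):
    """Expected revenue of the conditional-expectation policy at full capacity.

    items    -- list of (p_t, v_t) in arrival order, one item per arrival time
                (an illustrative restriction of the general definition)
    capacity -- integer knapsack capacity C, with sum of p_t assumed <= C
    """
    nxt = [0.0] * (capacity + 1)           # values for the time step after the last item
    for p, v in reversed(items):
        cur = [0.0] * (capacity + 1)       # cur[0] stays 0: no capacity, no revenue
        for r in range(1, capacity + 1):
            cur[r] = p * max(v + nxt[r - 1], nxt[r]) + (1.0 - p) * nxt[r]
        nxt = cur
    return nxt[capacity]


def lp_value(items):
    """O_e = sum_t p_t * v_t."""
    return sum(p * v for p, v in items)


eps = 0.01
for items in ([(0.5, 1.0), (0.5, 3.0)], [(1.0 - eps, 1.0), (eps, (1.0 - eps) / eps)]):
    online, upper = online_knapsack_value(items, 1), lp_value(items)
    print(online, upper, 0.5 * upper <= online <= upper)
```

The second instance is the two-item example with probabilities $1 - \epsilon$ and $\epsilon$, where the ratio comes close to the lower bound.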

Proof (Lemma 2). The upper bound is trivial. Clearly, no algorithm (offline or online)
can get more than $p_j v_j$ revenue in expectation from each item $j$. So the total expected
revenue is upper bounded by $O_e = \sum_j p_j v_j$. Next we prove the lower bound.

To prove the lower bound we first narrow down the instances that would give the
smallest $E[O_o]$. The plan of the proof is as follows. First, we show that for each $t$ if
we replace all the items arriving at time $t$ (i.e., all items in set $T_t$) with a single item
with probability $p_t = \sum_{j \in T_t} p_j$ and value $v_t = \sum_{j \in T_t} \frac{p_j v_j}{p_t}$, we may only decrease
$E[O_o]$ but $O_e$ does not change. So this replacement may only decrease $E[O_o]$ and after
the replacement, each partition only contains one item. So, WLOG, we only need to
prove the lower bound for instances in which each partition contains one item. Next,
we argue that if we scale all $v_j$'s by a constant, both $E[O_o]$ and $O_e$ are scaled by the
same constant. So, WLOG, we assume that $O_e = 1$. Therefore, we only need to prove
a lower bound on the following program:



minimize. S[0 D ] 
s.t. P > 1 



We then consider a linear relaxation of the above program and prove a lower bound 
of J on this relaxation which also implies a lower bound of h for the original program. 
We prove this by constructing a feasible solution for the dual of this linear program that 
achieve a value of ^ . 

In what follows, we explain each step of the proof in more detail.

First of all, we claim that if we replace all of the items arriving at time $t$ (i.e., all items
in $T_t$) with a single item with probability $p_t = \sum_{j \in T_t} p_j$ and value
$v_t = \sum_{j \in T_t} p_j v_j / p_t$, then $E[O_o]$ may decrease but $O_e$ is not affected. Let
$E[O'_o]$ and $O'_e$ respectively denote the result of making this replacement. The fact that
$O_e$ is not affected is trivial because $v_t p_t = \sum_{j \in T_t} p_j v_j$, so $O'_e = O_e$. Let
${E'}_t^r$ denote the expectation after replacing all items in $T_{t^*}$ with a single item as
explained. For all values of $t > t^*$ nothing is affected, so ${E'}_t^r = E_t^r$. Consider what
happens at $t = t^*$ when we make the replacement.

From (EXP) we have:

$$E_t^r = \sum_{j \in T_t} p_j \max\left(v_j + E_{t+1}^{r-1},\, E_{t+1}^{r}\right) + \Big(1 - \sum_{j \in T_t} p_j\Big) E_{t+1}^{r}$$

For any convex function $f(\cdot)$ and nonnegative $\alpha_i$'s with $\sum_i \alpha_i = 1$ it always holds
that $\sum_i \alpha_i f(x_i) \ge f(\sum_i \alpha_i x_i)$, and $\max(x + a, b)$ is a convex function of $x$, so:

$$\begin{aligned}
E_t^r &= p_t \sum_{j \in T_t} \frac{p_j}{p_t} \max\left(v_j + E_{t+1}^{r-1},\, E_{t+1}^{r}\right) + (1 - p_t) E_{t+1}^{r} \\
&\ge p_t \max\Big(\sum_{j \in T_t} \frac{p_j}{p_t} v_j + E_{t+1}^{r-1},\, E_{t+1}^{r}\Big) + (1 - p_t) E_{t+1}^{r} \\
&= p_t \max\left(v_t + E_{t+1}^{r-1},\, E_{t+1}^{r}\right) + (1 - p_t) E_{t+1}^{r} \\
&= {E'}_t^r
\end{aligned}$$

So we proved that ${E'}_t^r \le E_t^r$ for $t = t^*$. Furthermore, notice that according to
equation (EXP), for each $t$, $E_{t-1}^r$ is an increasing function of $E_t^r$ and $E_t^{r-1}$, so if
$E_t^r$ decreases then $E_{t-1}^r$ may only decrease. So for all values of $t < t^*$ we can argue
that ${E'}_t^r \le E_t^r$, and in particular $E[O'_o] = {E'}_1^C \le E_1^C = E[O_o]$. That means the
replacement may only decrease the expected revenue of our algorithm. So if we replace all
the items in each $T_t$ with a single item as explained above, one by one, we get an instance
in which each partition only contains one item and with a possibly lower expected revenue
from our algorithm. Therefore, WLOG, it is enough to prove a lower bound for the case
where each partition contains one item.

Since scaling all $v_j$'s by a constant scales both $E[O_o]$ and $O_e$ by the same constant,
we can scale all $v_j$'s so that $O_e = 1$. So, WLOG, we only need to prove the lower
bound for cases where $O_e = 1$. Now, we argue that the optimal value of the following
program gives a lower bound on $E[O_o]$. Therefore, we only need to prove that the optimal
value of this program is bounded below by $\tfrac{1}{2}$.

$$\text{minimize } E[O_o] \quad \text{s.t. } O_e \ge 1$$

We now rewrite the previous program as the following linear program with
variables $E_t^r$ and $v_t$ (with $t \in [u]$ and $r \in [C]$) by using the definition of $E_t^r$ from
(EXP). Note that $E[O_o] = E_1^C$. Also note that, in the following, we address each item
by the index of the partition to which it belongs.

$$\begin{array}{lll}
\text{minimize} & E_1^C & \\
\text{s.t.} & E_t^r \ge p_t \left(v_t + E_{t+1}^{r-1}\right) + (1 - p_t) E_{t+1}^{r} & \forall t \in [u-1],\ \forall r \in [C] \\
& E_t^r \ge E_{t+1}^{r} & \forall t \in [u-1],\ \forall r \in [C] \\
& E_u^r \ge p_u v_u & \forall r \in [C] \\
& \sum_{t=1}^{u} p_t v_t \ge 1 & \\
& v_t \ge 0,\ E_t^r \ge 0 &
\end{array}$$

Notice that any feasible assignment for the original program is also a feasible
assignment for the above program but not vice versa. So the above program is a linear
relaxation of the original program and therefore its optimal value is a lower bound for
the optimal value of the original program.

The above linear program is still not quite easy to analyze, so we consider a looser
relaxation as we explain next. First, it is not hard to show that $E_t^r$ as defined in (EXP)
has decreasing marginal value in $r$, which implies $E_t^{r-1} \ge \frac{r-1}{r} E_t^r$ (this can be
proved by induction on $t$ with the base case being $t = u$ and then proving for smaller $t$'s;
we will prove this formally later). Combining this with the definition of $E_t^r$ from (EXP),
we get the following inequality:

$$\begin{aligned}
E_t^r &= p_t \max\left(v_t + E_{t+1}^{r-1},\, E_{t+1}^{r}\right) + (1 - p_t) E_{t+1}^{r} \\
&= \max\left(p_t v_t + p_t E_{t+1}^{r-1} + (1 - p_t) E_{t+1}^{r},\, E_{t+1}^{r}\right) \\
&\ge \max\left(p_t \Big(v_t + \frac{r-1}{r} E_{t+1}^{r}\Big) + (1 - p_t) E_{t+1}^{r},\, E_{t+1}^{r}\right) \\
&= \max\left(p_t v_t + \Big(1 - \frac{p_t}{r}\Big) E_{t+1}^{r},\, E_{t+1}^{r}\right)
\end{aligned}$$



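Next, we can write the following more relaxed linear program with only variables
$E_t^C$ and $v_t$ (with $t \in [u]$); the dual variable of each constraint is shown on the right:

$$\begin{array}{llll}
\text{minimize} & E_1^C & & \\
\text{s.t.} & E_t^C - p_t v_t - \Big(1 - \frac{p_t}{C}\Big) E_{t+1}^C \ge 0 & \forall t \in [u-1] & (\alpha_t) \\
& E_u^C - p_u v_u \ge 0 & & (\alpha_u) \\
& E_t^C - E_{t+1}^C \ge 0 & \forall t \in [u-1] & (\beta_t) \\
& \sum_{t=1}^{u} p_t v_t \ge 1 & & (\gamma) \\
& v_t \ge 0,\ E_t^C \ge 0 & &
\end{array}$$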



Next, we show that the optimal value of the above program is bounded below by $\tfrac{1}{2}$,
which implies that the optimal value of the original program is also bounded below by
$\tfrac{1}{2}$, and that completes the proof. To do this, we present a feasible assignment for the
dual program that obtains an objective value of at least $\tfrac{1}{2}$. Note that the objective value
of any feasible assignment for the dual program gives a lower bound on the optimal value
of the primal program. The following is the dual program:



It is not hard to show that any optimal assignment of this linear program is also a feasible 
optimal assignment for the original program so the optimal value of the linear program and the 
original program are in fact equal. 
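$$\begin{array}{llll}
\text{maximize} & \gamma & & \\
\text{s.t.} & \gamma p_t - \alpha_t p_t \le 0 & \forall t \in [u] & (v_t) \\
& \alpha_1 + \beta_1 \le 1 & & (E_1^C) \\
& \alpha_t + \beta_t - \Big(1 - \frac{p_{t-1}}{C}\Big) \alpha_{t-1} - \beta_{t-1} \le 0 & \forall t \in [2 \cdots u-1] & (E_t^C) \\
& \alpha_u - \Big(1 - \frac{p_{u-1}}{C}\Big) \alpha_{u-1} - \beta_{u-1} \le 0 & & (E_u^C) \\
& \alpha_t \ge 0,\ \beta_t \ge 0,\ \gamma \ge 0 & &
\end{array}$$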

Now, suppose we set all $\alpha_t = \gamma$ and $\beta_t = \beta_{t-1} - \frac{p_{t-1}}{C}\gamma$ for all $t$ except
$\beta_1 = 1 - \gamma$. From this assignment, we get $\beta_t = 1 - \gamma - \gamma \sum_{k=1}^{t-1} \frac{p_k}{C}$. Observe
that we get a feasible solution as long as all $\beta_t$'s resulting from this assignment are
non-negative. Furthermore, since $\sum_k p_k \le C$, it is easy to see that
$\beta_t \ge 1 - \gamma - \gamma \sum_{k=1}^{u} \frac{p_k}{C} \ge 1 - 2\gamma$. Therefore, for $\gamma = \tfrac{1}{2}$, all $\beta_t$'s are
non-negative and we always get a feasible solution for the dual with an objective value
of $\tfrac{1}{2}$, which completes the main proof. Next, we present the proof of our earlier claim
that $E_t^{r-1} \ge \frac{r-1}{r} E_t^r$.
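The dual assignment can be checked numerically on a small instance. The sketch below is illustrative only: the function names are hypothetical, lists are 0-based (so `p[t-1]` holds $p_t$), and it assumes $u \ge 2$ and $\sum_t p_t \le C$.

```python
from typing import List, Tuple


def dual_assignment(p: List[float], C: float, gamma: float = 0.5) -> Tuple[List[float], List[float]]:
    """alpha_t = gamma for t in [u]; beta_1 = 1 - gamma; beta_t = beta_{t-1} - gamma * p_{t-1} / C."""
    u = len(p)
    alpha = [gamma] * u
    beta = [1.0 - gamma]                      # beta[0] is beta_1
    for t in range(1, u - 1):                 # builds beta_2 .. beta_{u-1}
        beta.append(beta[-1] - gamma * p[t - 1] / C)
    return alpha, beta


def dual_feasible(p: List[float], C: float, alpha: List[float], beta: List[float],
                  gamma: float, tol: float = 1e-12) -> bool:
    """Check every constraint of the dual program for the given assignment."""
    u = len(p)
    if gamma < -tol or min(alpha) < -tol or min(beta) < -tol:
        return False
    # (v_t): gamma * p_t - alpha_t * p_t <= 0
    if any(gamma * p[t] - alpha[t] * p[t] > tol for t in range(u)):
        return False
    # (E_1^C): alpha_1 + beta_1 <= 1
    if alpha[0] + beta[0] > 1.0 + tol:
        return False
    # (E_t^C) for t = 2 .. u-1
    for t in range(1, u - 1):
        lhs = alpha[t] + beta[t] - (1.0 - p[t - 1] / C) * alpha[t - 1] - beta[t - 1]
        if lhs > tol:
            return False
    # (E_u^C)
    lhs = alpha[u - 1] - (1.0 - p[u - 2] / C) * alpha[u - 2] - beta[u - 2]
    return lhs <= tol
```

For example, with `p = [0.5, 0.25, 0.25]` and `C = 1.0` the assignment with `gamma = 0.5` passes all checks, while `gamma = 0.6` violates the $(E_u^C)$ constraint on this instance.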

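We now prove that $E_t^r \ge \frac{r}{r+1} E_t^{r+1}$ by induction on $t$ with the base case being
$t = u$, which is trivially true because $E_u^r = p_u v_u$ for all $r \ge 1$. Next we assume that our
claim holds for $t + 1$ and all values of $r$. We then prove it for $t$ and all values of $r$ as
follows:

$$\begin{aligned}
E_t^r &= p_t \max\left(v_t + E_{t+1}^{r-1},\, E_{t+1}^{r}\right) + (1 - p_t) E_{t+1}^{r} \\
&= \max\left(p_t \left(v_t + E_{t+1}^{r-1}\right) + (1 - p_t) E_{t+1}^{r},\, E_{t+1}^{r}\right)
\end{aligned}$$

Observe that $\max(a, b) \ge \max((1 - \epsilon) a + \epsilon b,\, b)$ for all $\epsilon \in [0, 1]$, so:

$$\begin{aligned}
E_t^r &\ge \max\left((1 - \epsilon)\left[p_t \left(v_t + E_{t+1}^{r-1}\right) + (1 - p_t) E_{t+1}^{r}\right] + \epsilon E_{t+1}^{r},\, E_{t+1}^{r}\right) \\
&= \max\left((1 - \epsilon)\, p_t \Big(v_t + E_{t+1}^{r-1} + \frac{\epsilon}{1 - \epsilon} E_{t+1}^{r}\Big) + (1 - p_t) E_{t+1}^{r},\, E_{t+1}^{r}\right)
\end{aligned}$$

Now by applying the induction hypothesis on $E_{t+1}^{r-1}$ and $E_{t+1}^{r}$ and setting $\epsilon = \frac{1}{r+1}$:

$$\begin{aligned}
E_t^r &\ge \max\left(\frac{r}{r+1}\, p_t \Big[v_t + E_{t+1}^{r-1} + \frac{1}{r} E_{t+1}^{r}\Big] + (1 - p_t) E_{t+1}^{r},\, E_{t+1}^{r}\right) \\
&\ge \frac{r}{r+1} \max\left(p_t \left(v_t + E_{t+1}^{r}\right) + (1 - p_t) E_{t+1}^{r+1},\, E_{t+1}^{r+1}\right) \\
&= \frac{r}{r+1} E_t^{r+1}
\end{aligned}$$

So we proved that $E_t^r \ge \frac{r}{r+1} E_t^{r+1}$ and that completes the proof.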
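As a sanity check of the claim $E_t^{r-1} \ge \frac{r-1}{r} E_t^r$, the recursion for single-item partitions can be evaluated directly. The helper names below are hypothetical, and the memoized recursion is just one convenient way to evaluate (EXP) for this reduced case.

```python
from functools import lru_cache
from typing import Callable, Sequence, Tuple


def make_E(items: Sequence[Tuple[float, float]]) -> Callable[[int, int], float]:
    """items[t-1] = (p_t, v_t): one item per partition, as in the reduced instance."""
    u = len(items)

    @lru_cache(maxsize=None)
    def E(t: int, r: int) -> float:
        if r == 0 or t > u:          # E_t^0 = 0 and E_{u+1}^r = 0
            return 0.0
        p, v = items[t - 1]
        return p * max(v + E(t + 1, r - 1), E(t + 1, r)) + (1.0 - p) * E(t + 1, r)

    return E


def marginals_decrease(items: Sequence[Tuple[float, float]], capacity: int, tol: float = 1e-12) -> bool:
    """Check E_t^{r-1} >= ((r-1)/r) * E_t^r for every t in [u] and r in [capacity]."""
    E = make_E(items)
    return all(
        E(t, r - 1) + tol >= (r - 1) / r * E(t, r)
        for t in range(1, len(items) + 1)
        for r in range(1, capacity + 1)
    )
```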
Now we can prove the main two theorems using Lemma 2.

Proof (Theorem 3). We apply Lemma 2 to the subset of queries associated with each
customer $k$ (i.e., $S_k$) separately. We may think of this as having a knapsack of capacity
$C_k$ for customer $k$. Each advertiser/query pair $(i, j)$ is a knapsack item with value $u_{ij}$.
All knapsack items of the form $(\cdot, j)$ with the same $j$ are mutually exclusive (because
at most one advertiser is chosen at random) and they all have the same arrival time.
Therefore, by applying Lemma 2, from the knapsack of each customer $k$ we get at least
$\tfrac{1}{2}\big(\sum_{j \in S_k} \sum_i x_{ij} u_{ij}\big)$ in expectation. So overall, we get $\tfrac{1}{2}$ of the optimal value
of the expectation LP and that completes the proof.

Proof (Theorem 4). The proof is essentially the same as the proof of Theorem 3. The
only difference is that we may also lose at most a factor of $\tfrac{1}{e}$ from each advertiser due
to going over the budget limit. Note that this is a gross overestimation because using
conditional expectation on each customer may result in discarding some of the queries,
which would make it less likely for advertisers to hit their budget limit. So overall, we
get at least $\tfrac{1}{2} - \tfrac{1}{e}$ of the optimal value of the expectation LP.

4 Offline Setting 

In the offline setting, we explicitly know all the queries, that is, all the (customer,
location, item) triplets on which advertisers put their bids. We want to obtain an allocation
of advertisers to queries such that the total payment obtained from all the advertisers is
maximized. Each advertiser pays an amount equal to the minimum of his budget and
the total bid value on all the queries assigned to him. Since the problem is NP-hard,
we can only obtain an approximation algorithm achieving revenue close to the optimal.
The fractional optimal solution of $LP_{BC}$ (with explicit values for $X_j$, $j \in [n]$) acts as
an upper bound on the optimal revenue. We round the fractional optimal solution to a
nearby integer solution and establish the following bound.
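To make the offline objective concrete, the following sketch evaluates an integral allocation: each advertiser pays the minimum of his budget and his total assigned bids, and each customer receives at most his capacity of queries. The dictionary-based data layout and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Tuple


def allocation_revenue(assignment: Dict[Hashable, Hashable],
                       bids: Dict[Tuple[Hashable, Hashable], float],
                       budgets: Dict[Hashable, float]) -> float:
    """assignment maps query -> advertiser; bids is keyed by (advertiser, query)."""
    total_bid = defaultdict(float)
    for query, advertiser in assignment.items():
        total_bid[advertiser] += bids[(advertiser, query)]
    # Each advertiser pays min(budget, total bid value of the queries assigned to him).
    return sum(min(budgets[a], value) for a, value in total_bid.items())


def respects_capacities(assignment: Dict[Hashable, Hashable],
                        customer_of: Dict[Hashable, Hashable],
                        capacity: Dict[Hashable, int]) -> bool:
    """True if no customer receives more assigned queries than his capacity."""
    used = defaultdict(int)
    for query in assignment:
        used[customer_of[query]] += 1
    return all(used[k] <= capacity[k] for k in used)
```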

Theorem 5. Given a fractional optimal solution for $LP_{BC}$, we can obtain an integral
solution for AdCell with budget and capacity constraints that obtains at least a profit of
$\frac{4 - \max_i \frac{b_{i,\max}}{B_i}}{4}$ of the profit obtained by optimal fractional allocation and maintains all
the capacity constraints exactly.

We note that this approximation ratio is best possible using the considered LP
relaxation due to an integrality gap example from [4]. The problem considered in [4] is
an uncapacitated version of the AdCell problem, that is, there is no capacity constraint
($C$) on the customers. The capacity constraint restricts how many queries/advertisements
can be assigned to each customer. We can represent all the queries associated with each
customer as a set; these sets are therefore disjoint and have integer hard capacities
associated with them. Our approximation ratio matches the best known bound from [4,21]
for the uncapacitated case. In this section, we give a high-level description of the
algorithm. We present the detailed description and proof in the next section. Our algorithm
is based on applying the rounding technique of [20] through several iterations. The
essential idea of the proposed rounding is to apply a procedure called Rand-Move to the
variables of a suitably chosen subset of constraints from the original linear program.
This sub-system must be underdetermined to ensure that the rounding proceeds
without violating any constraint and at least one variable becomes integral. The trick lies
in choosing a proper sub-system at each step of rounding, which again depends on a
detailed case analysis of the LP structure.

Let $y^*$ denote the LP optimal solution. We begin by simplifying the assignment
given by $y^*$. Consider a bipartite graph $G(B, I, E^*)$ with advertisers $B$ on one side,
queries $I$ on the other side and add an edge between an advertiser $i$ and query $j$ if
$y^*_{i,j} \in (0, 1)$. That is, define $E^* = \{(i, j) \mid 1 > y^*_{i,j} > 0\}$. Our first claim is that $y^*$
can be modified without affecting the optimal fractional value and the constraints such that
$G(B, I, E^*)$ is a forest. The proof follows from Claim 2.1 of [4]; we additionally show
that such an assumption of forest structure maintains the capacity constraints.

Lemma 3. The bipartite graph $G = (B, I, E^*)$ induced by the edges $E^*$ can be converted
to a forest maintaining the optimal objective function value.

Proof. Consider the graph $G = (B, I, E^*)$ and consider one connected component of
it. We will argue for each component separately and similarly.

Cycle Breaking: Suppose there is a cycle in the chosen component. Since $G$ is
bipartite, the cycle has even length. Let the cycle be $C = (i_1, j_1, i_2, j_2, \dots, i_l, j_l, i_1)$,
that is, consider the cycle to start from an advertiser node. Consider a strictly positive
value $\beta$ and consider the following update of the $y^*$ values over the edges in the cycle
$C$. We add $z_{a,b}$ to edge $(a, b)$, where

R1. $z_{i_1,j_1} = -\beta$
R2. If we are at a query node $j_t$, $t \in [1, l]$, then $z_{j_t,i_{t+1}} = -z_{i_t,j_t}$ (with $i_{l+1} = i_1$)
R3. If we are at an advertiser node $i_t$, $t \in [2, l]$, then $z_{i_t,j_t} = -\frac{b_{i_t,j_{t-1}}}{b_{i_t,j_t}}\, z_{j_{t-1},i_t}$

$\beta$ is chosen such that after the update, all the variables lie in $[0, 1]$ and at least
one variable gets rounded to 0 or 1, thus the cycle is broken. Note that the entire
update is a function of $z_{i_1,j_1}$. For any query node, its total contribution in the (Assign)
constraint of LP1 remains unchanged. For any advertiser node, except $i_1$, its contribution
in the (Advertiser) constraint and thus in the objective function remains the same. In
addition, since the assign constraints remain unaffected, all the capacity constraints are
satisfied. For advertiser $i_1$, its contribution decreases by $\beta\, b_{i_1,j_1}$ and increases by

$$z_{j_l,i_1}\, b_{i_1,j_l} = \beta\, b_{i_1,j_l}\, \frac{b_{i_2,j_1}}{b_{i_2,j_2}} \cdot \frac{b_{i_3,j_2}}{b_{i_3,j_3}} \cdots \frac{b_{i_l,j_{l-1}}}{b_{i_l,j_l}}$$

If $b_{i_1,j_1} < b_{i_1,j_l} \frac{b_{i_2,j_1} b_{i_3,j_2} \cdots b_{i_l,j_{l-1}}}{b_{i_2,j_2} b_{i_3,j_3} \cdots b_{i_l,j_l}}$, then instead of adding $z_{j_l,i_1}$ on the last
edge, we add some $\epsilon < z_{j_l,i_1}$ such that $\beta\, b_{i_1,j_1} = \epsilon\, b_{i_1,j_l}$. Thus, we are able to
maintain the objective function exactly. The assign constraint on the last query $j_l$ can only
decrease by this change and hence all the capacity constraints are maintained as well.

Otherwise, $b_{i_1,j_1} \ge b_{i_1,j_l} \frac{b_{i_2,j_1} b_{i_3,j_2} \cdots b_{i_l,j_{l-1}}}{b_{i_2,j_2} b_{i_3,j_3} \cdots b_{i_l,j_l}}$. In that case, we traverse the cycle in
the reverse order, that is, we start by decreasing on $z_{i_1,j_l}$ first and proceed similarly.

Once we have such a forest structure, several cases arise and, depending on the
case, we define a suitable sub-system on which to apply the rounding technique. There
are three major cases.

(i) There is a tree with two leaf advertiser nodes: in that case, we show that
applying our rounding technique only diminishes the objective function by little and all
constraints are maintained.

(ii) No tree contains two leaf advertisers, but there is a tree that contains one leaf
advertiser: we start with a leaf advertiser and construct a path spanning several trees
such that we either end up with a combined path with advertisers on both sides or a
query node on one side such that the capacity constraint on the set containing that query
is not met with equality (non-tight constraint). This is the most nontrivial case and a
detailed discussion is given in the next section.

(iii) No tree contains any leaf advertiser nodes: in that case we again form a
combined path spanning several trees such that the queries on the two ends of the combined
path come from sets with non-tight capacity constraints.
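Which of the three cases applies depends only on how many leaf advertisers each tree of the forest has. The sketch below classifies a forest given as a list of fractional (advertiser, query) edges; the function name, the node tagging scheme and the returned labels are assumptions for illustration.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Tuple


def classify_forest(edges: Iterable[Tuple[Hashable, Hashable]]) -> str:
    """Return "i", "ii" or "iii" according to the leaf advertisers found in the trees."""
    adj = defaultdict(set)
    for advertiser, query in edges:
        a, q = ("adv", advertiser), ("query", query)  # tag nodes to keep the two sides apart
        adj[a].add(q)
        adj[q].add(a)

    seen = set()
    most_leaf_advertisers = 0  # largest number of leaf advertisers in a single tree
    for start in list(adj):
        if start in seen:
            continue
        seen.add(start)
        stack, leaf_advertisers = [start], 0
        while stack:  # depth-first traversal of one tree
            node = stack.pop()
            if node[0] == "adv" and len(adj[node]) == 1:
                leaf_advertisers += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        most_leaf_advertisers = max(most_leaf_advertisers, leaf_advertisers)

    if most_leaf_advertisers >= 2:
        return "i"
    if most_leaf_advertisers == 1:
        return "ii"
    return "iii"
```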

5 The Detailed Description and Proofs of the Offline Algorithm 
5.1 Generic Rounding Scheme 

Rounding Scheme of [20]. Suppose we are given a set of linear constraints $Ax \le b$,
where $A$ is an $m \times n$ real matrix, $x \in [0, 1]^n$ and $b \in \mathbb{R}^m$. We are also given an
optimal fractional solution $x \in [0, 1]^n$ that optimizes a particular objective function,
say "$\max c^T x$", $c \in \mathbb{R}^n$. Our goal is to round the variables in $x$ to $\{0, 1\}^n$ such that
the value of the objective function remains close to the initial fractional optimal and the
constraints $Ax \le b$ are maintained to the extent possible.

Project $x$ to only those components $x'$ with values in $(0, 1)$. Suppose $x' \in (0, 1)^{n'}$.
The components $x \setminus x'$, which are already rounded, have their values fixed. Denote the
reduced system by $A'x' \le b'$, where $A'$ is now an $m \times n'$ real matrix, $x' \in [0, 1]^{n'}$ and
$b' \in \mathbb{R}^m$. Consider only the tightly satisfied linearly independent constraints from the
system $A'x' \le b'$. That is, these constraints are satisfied with equality and are linearly
independent. Suppose this subset of constraints is $\tilde{A}x' = \tilde{b}$. We compute an $r \in \mathbb{R}^{n'}$,
$r \ne 0^{n'}$, such that $\tilde{A}r = 0$, if such an $r$ exists. We know that if the system $\tilde{A}x' = \tilde{b}$ is
underdetermined, that is, has more variables than equations, then the nullspace of $\tilde{A}$ is
nontrivial and thus computing a nontrivial $r$ is easy. Once such an $r$ is computed, we
consider the following two possible updates:

Rand-Move:

Update $x_{new} = x + \alpha r$ with probability $\frac{\beta}{\alpha + \beta}$, and
$x_{new} = x - \beta r$ with probability $\frac{\alpha}{\alpha + \beta}$.

Here $\alpha$ and $\beta$ are two nonzero reals such that $x_{new} \in [0, 1]^{n'}$. At least one
component after the update gets rounded to 0 or 1, or one more constraint from $A' \setminus \tilde{A}$
becomes tight. It is easy to verify that such $\alpha$ and $\beta$ always exist. Note that
$E[x_{new}] = x$ (P1).

If the system $\tilde{A}r = 0$ does not have any nontrivial solution, then we choose suitable
constraints to drop from $\tilde{A}$ and make the system underdetermined.

The process continues until all the variables are rounded and is guaranteed to
terminate in polynomial time.
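A single Rand-Move step can be sketched as follows, assuming the null-space direction $r$ has already been computed. This simplified illustration only lets the box constraints $[0, 1]$ limit the step sizes; the non-tight constraints of $A' \setminus \tilde{A}$, which may also bound $\alpha$ and $\beta$, are ignored here. The function names are hypothetical.

```python
import random
from typing import List, Sequence


def max_step(x: Sequence[float], direction: Sequence[float], eps: float = 1e-12) -> float:
    """Largest s >= 0 such that x + s * direction stays inside [0, 1]^n."""
    s = float("inf")
    for xi, di in zip(x, direction):
        if di > eps:
            s = min(s, (1.0 - xi) / di)
        elif di < -eps:
            s = min(s, xi / -di)
    return s


def rand_move(x: Sequence[float], r: Sequence[float], rng=random) -> List[float]:
    """One Rand-Move step for a nonzero direction r and a point x in (0, 1)^n."""
    alpha = max_step(x, r)
    beta = max_step(x, [-ri for ri in r])
    # Move by +alpha*r with probability beta/(alpha+beta), else by -beta*r,
    # so that the expected value of the new point equals x (property P1).
    if rng.random() < beta / (alpha + beta):
        return [xi + alpha * ri for xi, ri in zip(x, r)]
    return [xi - beta * ri for xi, ri in zip(x, r)]
```

Either branch drives at least one coordinate to 0 or 1, which is what guarantees progress of the rounding.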
5.2 Rounding Algorithm

Let $y^*$ denote the LP optimal solution. We begin by simplifying the assignment given by
$y^*$. Consider a bipartite graph $G(B, I, E^*)$ with advertisers $B$ on one side, queries $I$ on
the other side and add an edge $(i, j)$ between an advertiser $i$ and query $j$ if $y^*_{i,j} \in (0, 1)$.
That is, define $E^* = \{(i, j) \mid 1 > y^*_{i,j} > 0\}$. By Lemma 3 we know that $y^*$ can be
modified without affecting the value of LPOpt such that $G(B, I, E^*)$ is a forest.

We now have a collection of trees. There can arise several cases at this stage. For
each of these cases, we identify a set of linear constraints and apply our Rand-Move
step on the variables in the chosen system of linear constraints. We now specify each
of these cases and the system of linear constraints associated with that case. For
Rand-Move to be applicable, we show that our chosen linear system is underdetermined. For
the correctness proof, we show that all the assign and capacity constraints are
maintained. Some advertiser constraints may get violated, but in the objective an advertiser $i$
can pay at most $B_i$. We show that indeed the loss in the objective is at most $\tfrac{1}{4}$ of the
optimal objective value. Thus, we obtain a $\tfrac{3}{4}$-approximation.

Let $y$ denote the LP solution at this stage. There are three main cases to consider:

Case (i). There is a tree with two leaf advertiser nodes. 

Case (ii). No tree contains two leaf advertisers, but there is a tree that contains one 
leaf advertiser. 

Case (iii). No tree contains any leaf advertiser nodes. 

Case (i). There is a tree with two leaf advertiser nodes. Consider the unique path $P$
connecting the two leaf advertisers, say $i_0$ and $i_l$. Suppose $P = (i_0, j_1, i_1, j_2, i_2, \dots, j_l, i_l)$.
Define an $x$ variable for each edge in the path $P$ that takes values in $[0, 1]$. Consider the
following system of linear constraints:

$$\begin{array}{lll}
x_{i_{t-1},j_t} + x_{i_t,j_t} = y_{i_{t-1},j_t} + y_{i_t,j_t} & \forall t \in [1, l] & (5.1) \\
x_{i_t,j_t} b_{i_t,j_t} + x_{i_t,j_{t+1}} b_{i_t,j_{t+1}} = y_{i_t,j_t} b_{i_t,j_t} + y_{i_t,j_{t+1}} b_{i_t,j_{t+1}} & \forall t \in [1, l-1] & (5.2) \\
x \in [0, 1]^{2l} & & (5.3)
\end{array}$$

We apply Rand-Move on the above linear system.

Lemma 4. The linear system defined by Equations (5.1) and (5.2) is underdetermined, and
Assign constraints for all queries, Capacity constraints for all sets and Bidder constraints
for all advertisers except the two leaf advertisers are maintained.

Proof. The number of constraints of type (5.1) is $l$ and the number of constraints of type
(5.2) is $l - 1$. However, the number of variables is $2l$. Constraint (5.1) ensures all the assign
constraints and hence all the capacity constraints are maintained. Constraint (5.2) ensures
all the advertisers maintain their budget except possibly the two leaf advertisers.
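Because the system (5.1)-(5.2) has $2l - 1$ equations over $2l$ variables, a nontrivial direction $r$ with zero change in every equation can be written down by walking along the path. The sketch below assumes the edges are listed in path order, $(i_0, j_1), (i_1, j_1), (i_1, j_2), \dots, (i_l, j_l)$, with strictly positive bids; the function names and this list layout are hypothetical.

```python
from typing import List, Sequence


def path_null_vector(bids: Sequence[float]) -> List[float]:
    """bids[k] is the bid on the k-th edge of P in path order; returns r with zero residuals."""
    r = [1.0]
    for k in range(1, len(bids)):
        if k % 2 == 1:
            # Edges k-1 and k meet at a query: x-values must sum to a constant (5.1).
            r.append(-r[k - 1])
        else:
            # Edges k-1 and k meet at an inner advertiser: bid-weighted sum is constant (5.2).
            r.append(-r[k - 1] * bids[k - 1] / bids[k])
    return r


def residuals(bids: Sequence[float], r: Sequence[float]) -> List[float]:
    """Left-hand-side change of each constraint of (5.1)-(5.2) when moving along r."""
    out = []
    for k in range(1, len(bids)):
        if k % 2 == 1:
            out.append(r[k - 1] + r[k])
        else:
            out.append(bids[k - 1] * r[k - 1] + bids[k] * r[k])
    return out
```

For bids `[1.0, 2.0, 4.0, 1.0]` (so $l = 2$) this yields `r = [1.0, -1.0, 0.5, -0.5]`, and every residual is zero. Only the two leaf advertisers, which have no constraint of type (5.2), see their spending change.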

Case (ii). No tree contains two leaf advertisers, but there is a tree that contains one leaf
advertiser. There are several subcases under it. We first consider four simple subcases.

Subcase (1): There is a maximal path between two queries, where the two queries
belong to the same set and the set-capacity constraint is non-tight.

Since the path is maximal, the queries at the start and the end of the path are
leaf queries and therefore have non-tight assign constraints. Non-tight naturally
implies that a constraint is not satisfied with equality. Suppose the maximal path is
$P = (j_1, i_1, j_2, i_2, \dots, j_{l-1}, i_{l-1}, j_l)$ and let the value of the edge-variables associated
with this path be $(y_{i_1,j_1}, y_{i_1,j_2}, y_{i_2,j_2}, \dots, y_{i_{l-1},j_{l-1}}, y_{i_{l-1},j_l})$. These $y$ values are
treated as constants. Define variables $(x_{i_1,j_1}, x_{i_1,j_2}, x_{i_2,j_2}, \dots, x_{i_{l-1},j_{l-1}}, x_{i_{l-1},j_l})$
associated with these edges of $P$. Let $S$ be the set containing the queries $j_1$ and $j_l$. Let
the capacity of $S$ be $c$. In the current solution, considering the rounded variables as well,
let the total allocation of queries from the set $S$ be $s + y_{i_1,j_1} + y_{i_{l-1},j_l}$. That is, $s$ is the
sum of values of the queries in $S$ other than $j_1$ and $j_l$. Consider the following system of
linear constraints:

$$\begin{array}{lll}
x_{i_1,j_1} \le 1,\quad x_{i_{l-1},j_l} \le 1 & & (5.4) \\
x_{i_{t-1},j_t} + x_{i_t,j_t} = y_{i_{t-1},j_t} + y_{i_t,j_t} & \forall t \in [2, l-1] & (5.5) \\
x_{i_t,j_t} b_{i_t,j_t} + x_{i_t,j_{t+1}} b_{i_t,j_{t+1}} = y_{i_t,j_t} b_{i_t,j_t} + y_{i_t,j_{t+1}} b_{i_t,j_{t+1}} & \forall t \in [1, l-1] & (5.6) \\
x_{i_1,j_1} + x_{i_{l-1},j_l} \le c - s & & (5.7) \\
x \in [0, 1]^{2l-2} & & (5.8)
\end{array}$$

We apply Rand-Move on the above linear system.

Lemma 5. The linear system defined for Subcase 1 under Case (ii) is underdetermined
and Rand-Move on it maintains all the constraints, Assign, Bidder, Capacity, of LP-1.

Proof. Note that Constraint (5.7) is non-tight. In addition, Constraint (5.4) implies that
the leaf queries have non-tight assignment constraints. Now, the number of variables
associated with the above linear system is $2(l - 1) = 2l - 2$ and the number of tightly
satisfied linearly independent constraints is $2l - 3$. Hence, we can employ Rand-Move.

Constraint (5.5) implies the assignment constraints of the non-leaf queries are
maintained. Constraint (5.6) implies the budget constraints of the non-leaf advertisers, and
therefore of all the advertisers considered by this system, are maintained. The capacities
of all the sets in which non-leaf queries participate are automatically maintained. In
addition, Constraint (5.7) implies the capacity constraint of the set involving the leaf
queries is maintained as well.



Subcase (2): There is a maximal path between two queries, where the two queries
belong to two different sets and both set-capacity constraints are non-tight.

This is almost the same as Subcase (1). Since the path is maximal, the queries at the start
and the end of the path are leaf queries and therefore have non-tight assign constraints.
Suppose the maximal path is $P = (j_1, i_1, j_2, i_2, \dots, j_{l-1}, i_{l-1}, j_l)$ and let the value of
the edge-variables associated with this path be
$(y_{i_1,j_1}, y_{i_1,j_2}, y_{i_2,j_2}, \dots, y_{i_{l-1},j_{l-1}}, y_{i_{l-1},j_l})$. We treat these values as constants
here. Define variables $(x_{i_1,j_1}, x_{i_1,j_2}, x_{i_2,j_2}, \dots, x_{i_{l-1},j_{l-1}}, x_{i_{l-1},j_l})$ associated with
these edges of $P$. The set constraint involving the query $j_1$ is non-tight; suppose the total
sum of the values of the queries (rounded and not rounded) belonging to that set is
$s + y_{i_1,j_1}$, while its capacity is $c$. Similarly, the set constraint involving the query $j_l$ is
non-tight; suppose the total sum of the values of the queries (rounded and not rounded)
belonging to that set is $s' + y_{i_{l-1},j_l}$, while its capacity is $c'$. Consider the following
linear system:

$$\begin{array}{lll}
x_{i_1,j_1} \le 1,\quad x_{i_{l-1},j_l} \le 1 & & (5.9) \\
x_{i_{t-1},j_t} + x_{i_t,j_t} = y_{i_{t-1},j_t} + y_{i_t,j_t} & \forall t \in [2, l-1] & (5.10) \\
x_{i_t,j_t} b_{i_t,j_t} + x_{i_t,j_{t+1}} b_{i_t,j_{t+1}} = y_{i_t,j_t} b_{i_t,j_t} + y_{i_t,j_{t+1}} b_{i_t,j_{t+1}} & \forall t \in [1, l-1] & (5.11) \\
x_{i_1,j_1} \le c - s & & (5.12) \\
x_{i_{l-1},j_l} \le c' - s' & & (5.13) \\
x \in [0, 1]^{2l-2} & & (5.14)
\end{array}$$

Note the changes in the linear system from Subcase 1. We apply Rand-Move on
the above linear system.

Lemma 6. The linear system defined for Subcase 2 under Case (ii) is underdetermined
and Rand-Move on it maintains all the constraints, Assign, Bidder, Capacity, of LP-1.

Proof. The constraints (5.12) and (5.13) are non-tight and so is (5.9). The number of
variables associated with the above linear system is $2(l - 1) = 2l - 2$ and the number of
tightly satisfied linearly independent constraints is $2l - 3$. Hence, we can employ
Rand-Move.

Constraint (5.10) implies the assignment constraints of the non-leaf queries are
maintained. Constraint (5.11) implies the budget constraints of the non-leaf advertisers, and
therefore of all the advertisers considered by this system, are maintained. The constraints
(5.12) and (5.13) ensure that we do not violate the capacity constraints of the sets involving
the leaf queries $j_1$ and $j_l$ respectively.



Subcase (3): There is a path (not necessarily a maximal path) between two queries, where
the two queries belong to the same set, the set-capacity constraint is tight but both the
queries have non-tight assignment constraints.

Suppose the path is $P = (j_1, i_1, j_2, i_2, \dots, j_{l-1}, i_{l-1}, j_l)$ and let the value of the
edge-variables associated with this path be
$(y_{i_1,j_1}, y_{i_1,j_2}, y_{i_2,j_2}, \dots, y_{i_{l-1},j_{l-1}}, y_{i_{l-1},j_l})$. We treat these values as constants
here. Define variables $(x_{i_1,j_1}, x_{i_1,j_2}, x_{i_2,j_2}, \dots, x_{i_{l-1},j_{l-1}}, x_{i_{l-1},j_l})$ associated with
these edges of $P$. Let the total fractional assignment of query $j_1$ be $a_1 + y_{i_1,j_1} < 1$ and
the total fractional assignment of query $j_l$ be $a_2 + y_{i_{l-1},j_l} < 1$. Here we will apply the
Cycle Breaking trick. We consider updates
$(z_{j_1,i_1}, z_{i_1,j_2}, z_{j_2,i_2}, \dots, z_{j_{l-1},i_{l-1}}, z_{i_{l-1},j_l})$ such that

R1. $z_{j_1,i_1} = -\beta$
R2. If we are at a query node $j_t$, $t \in [2, l-1]$, then $z_{j_t,i_t} = -z_{i_{t-1},j_t}$
R3. If we are at an advertiser node $i_t$, $t \in [1, l-1]$, then $z_{i_t,j_{t+1}} = -\frac{b_{i_t,j_t}}{b_{i_t,j_{t+1}}}\, z_{j_t,i_t}$

The value of $\beta > 0$ is chosen so that all the edge-variables remain in $[0, 1]$,
$x_{i_{l-1},j_l} \le 1 - a_2$ and $x_{i_1,j_1} \le 1 - a_1$. The entire update is a function of $z_{j_1,i_1}$. If
$|z_{j_1,i_1}| > |z_{j_l,i_{l-1}}|$, then we apply the above update. Else, we consider the updates in the
reverse direction, starting from the edge $(j_l, i_{l-1})$.

Lemma 7. The update vector $z$ is nontrivial and the update maintains all the
constraints, Assign, Bidder, Capacity, of LP-1.

Proof. Clearly, all the advertiser nodes maintain their budget due to rule R3. All the
query nodes, except $j_1$ and $j_l$, maintain their assign constraint. All the sets that do not
contain $j_1$ or $j_l$ thus maintain their capacity constraints. We start the update by
subtracting from the edge $(j_1, i_1)$ if $|z_{j_1,i_1}| > |z_{j_l,i_{l-1}}|$. Therefore, the total assignment of
the set that contains both $j_1$ and $j_l$ is reduced and its capacity constraint is satisfied.
Otherwise, we start subtracting from the edge $(j_l, i_{l-1})$ and again the set containing $j_1$
and $j_l$ maintains the capacity constraint, since now $|z_{j_1,i_1}| \le |z_{j_l,i_{l-1}}|$.

Since $y_{i_1,j_1} < 1 - a_1$, $y_{i_{l-1},j_l} < 1 - a_2$ and all the other variables are in $(0, 1)$, we
can always find a $\beta > 0$ such that either $x_{i_1,j_1} = 1 - a_1$ or $x_{i_{l-1},j_l} = 1 - a_2$, or one of
them is rounded down to 0, or some other variable in the path is rounded to 0 or 1.

Subcase (4): There is a maximal path with an advertiser on one side and a query on the
other, with the set containing the query being non-tight.

Since we are considering a maximal path, the two end-points must be leaf nodes.
Suppose the maximal path is $P = (j_1, i_1, j_2, i_2, \dots, j_{l-1}, i_{l-1})$ and let the value of
the edge-variables associated with this path be $(y_{i_1,j_1}, y_{i_1,j_2}, y_{i_2,j_2}, \dots, y_{i_{l-1},j_{l-1}})$. Let
the set to which the query $j_1$ belongs be $S$ and let it have a total assignment from the
rounded and yet to be rounded variables equalling $s + y_{i_1,j_1}$. In addition, let its
capacity be $c$. Consider the following linear system:

$$\begin{array}{lll}
x_{i_1,j_1} \le 1 & & (5.15) \\
x_{i_{t-1},j_t} + x_{i_t,j_t} = y_{i_{t-1},j_t} + y_{i_t,j_t} & \forall t \in [2, l-1] & (5.16) \\
x_{i_t,j_t} b_{i_t,j_t} + x_{i_t,j_{t+1}} b_{i_t,j_{t+1}} = y_{i_t,j_t} b_{i_t,j_t} + y_{i_t,j_{t+1}} b_{i_t,j_{t+1}} & \forall t \in [1, l-2] & (5.17) \\
x_{i_1,j_1} \le c - s & & (5.18) \\
x \in [0, 1]^{2l-3} & & (5.19)
\end{array}$$

We apply Rand-Move on the above linear system.

Lemma 8. The linear system defined for Subcase 4 under Case (ii) is underdetermined
and Rand-Move on it maintains the constraints, Assign, Capacity, of LP-1 as well as
the Bidder constraint except possibly for the one leaf advertiser.

Proof. The constraints (5.15) and (5.18) are non-tight. The number of tightly satisfied
linearly independent constraints is therefore at most $(l - 2) + (l - 2) = 2l - 4$, whereas the
number of variables is $2l - 3$. Hence Rand-Move can be applied.

Constraints (5.16) and (5.15) ensure that all the assign constraints for the queries are
maintained. Constraint (5.17) ensures the advertiser constraints are maintained for all the
advertisers except possibly for the leaf advertiser. Constraints (5.16) and (5.18) ensure
that all the capacity constraints are maintained.

As long as Case (i) or Subcases (1)-(4) of Case (ii) apply, we continue applying them.
Also at any time, if we find that the linear system composed of all the tightly satisfied
linearly independent constraints of LP-1 for any tree becomes underdetermined, we apply
Rand-Move. When neither Subcases (1)-(4) nor Case (i) apply, and Rand-Move cannot
be applied to the whole system, we have the following properties of the resulting forest
structure:

1. (Case (i) does not apply): No two leaves are advertisers. So there can be at most one
leaf that is an advertiser in any tree.

2. (Subcase 3 does not apply): No two queries that are non-tight and belong to the
same set with tight capacity are in the same tree. Therefore, each tree can contain
only one non-tight query from a tight set.

3. (Rand-Move does not apply to the LP-1 constraints for any single tree): The number
of tightly satisfied linearly independent constraints from each tree is at least as large
as the number of variables.

4. (Subcases 1 and 2 do not apply): No two leaves that are queries belong to the same
set. Also among the leaves that are queries, at most one can belong to a set that
has a non-tight capacity constraint. In essence, there can be only one leaf that is a
query and that belongs to a set that has a non-tight capacity constraint.

5. (Subcase 4 does not apply): If there is a leaf node that is an advertiser in a tree, all
other leaf nodes must be queries and must be part of sets that have tight capacity
constraints.

Subcase (5): None of subcases (l)-(4) apply. 

This is the most nontrivial subcase. Denote the tree that contains a leaf advertiser 
node by T\ and let i\ be the advertiser that is a leaf. Consider a maximal path starting 
from i\. Since Case (i) or Subcases (1-4) do not apply, the other leaf end-point is an 
query, say jj 1 , that belongs to set Si and set S\ has tight capacity constraint. Of course, 
the query jj 1 has non-tight assign constraint since it is a leaf node. Let the path be as 
follows: 

Pi = (*l7.7l! *2j .72) • • ■ >*{i-l>iii-l)*ii>iii)- 

Since subcase 3 does not apply, tree T\ does not contain any other non-tight query 
from Si. Now capacities are always integer and set Si has tight capacity constraint. 
This implies that set Si must contain another non-tight query and that non-tight query 
must belong to a different tree. Denote this second tree by T2 and call this another non- 
tight query of Si by j\ . If T2 contains a leaf node that is a advertiser, consider the path 
from jf to that advertiser node. Say the path is, 

p _ / -2 -2 -2 -2 -2 -2 -2 -2 \ 

*2 — \Jl 7 Hi 32 > ■ ■ ■ ) l l 2 -2i Jl 2 -V l h-l' 3 1-2 > l l-2'- 
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Consider a combined path (Pi, P 2 ). 

{Pi ■• P2) = (^i)ii) • • • i ij-L-nij-L-ii i\ t , ji ± jii , i\, j\ , ■ ■ ■ ,ji 2 , if 2 )- 

Essentially this combined path is thought of a single path ending at two leaf ad- 
vertisers. We apply the rounding of Case (i) in this scenario with a slight change in 
handling the job nodes. We rewrite the linear system for convenience. 

x il-i,il + x i\,3\ = Vil-idi + Vit j t lV< S [1, /1 - 1] (5.20) 

x i't-xdi + x ^Ul = Vii-^i + wL? v * e [2 ' ^ (5 ' 21) 
< < 1 (5.22) 

x i\ Ji, + ^ Vi\ ,4 + ( 5 - 23 ) 

V(t,o) £ ([2,Zi], 1) U ([1,Z 2 - 1],2) (5.24) 
x e [0, i] 2i i+ 2 '2-2 (5.25) 

We apply Rand-Move as usual. Note that, essentially we are assuming jf and jf 
as a single node while writing the constraint l5.23l 

Lemma 9. The linear system defined above is underdetermined, and the assign constraints
for all queries, the advertiser constraints for all advertisers except $i_1^1$ and $i_{l_2}^2$, and the
capacity constraints for all sets are maintained.

Proof. Again, the number of linearly independent tightly satisfied constraints is
$(l_1 - 1) + (l_2 - 1) + 1 + (l_1 - 1) + (l_2 - 1) = 2l_1 + 2l_2 - 3$ from (5.20), (5.21), (5.23) and (5.24). The
number of variables is $2l_1 + 2l_2 - 2$. Thus Rand-Move can be applied. From constraints
(5.20), (5.21) and (5.22) we get that all the assign constraints and all the capacity constraints
except for set $S_1$ are satisfied. Constraint (5.23) ensures that the capacity constraint of
set $S_1$ is maintained. Constraint (5.24) maintains all the advertiser constraints except for
advertisers $i_1^1$ and $i_{l_2}^2$.
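
The counting argument in the proof can be illustrated with a minimal sketch. It assumes that Rand-Move is the usual randomized step along a null-space direction: pick $r$ with $Ar = 0$, let $\alpha, \beta$ be the largest steps keeping $y + \alpha r$ and $y - \beta r$ in the unit cube, and move to $y + \alpha r$ with probability $\beta/(\alpha+\beta)$ and to $y - \beta r$ otherwise. The function names `null_vector` and `rand_move`, the instance with $l_1 = 2$, $l_2 = 1$, and the bids 2 and 3 are hypothetical choices made only for illustration.

```python
from fractions import Fraction
import random


def null_vector(A, n):
    """Return a nonzero r with A r = 0, found by Gauss-Jordan elimination."""
    rows = [[Fraction(v) for v in row] for row in A]
    pivots = []  # pivots[k] is the pivot column of reduced row k
    for col in range(n):
        k = len(pivots)
        if k == len(rows):
            break  # every row has a pivot; remaining columns are free
        src = next((i for i in range(k, len(rows)) if rows[i][col] != 0), None)
        if src is None:
            continue  # no pivot in this column, so it stays free
        rows[k], rows[src] = rows[src], rows[k]
        piv = rows[k][col]
        rows[k] = [v / piv for v in rows[k]]
        for i in range(len(rows)):
            if i != k and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [vi - f * vk for vi, vk in zip(rows[i], rows[k])]
        pivots.append(col)
    free = [c for c in range(n) if c not in pivots]
    if not free:
        raise ValueError("system is not underdetermined")
    r = [Fraction(0)] * n
    r[free[0]] = Fraction(1)
    for k, col in enumerate(pivots):
        r[col] = -rows[k][free[0]]
    return r


def rand_move(y, r, rng=random):
    """One randomized step along r; assumes every entry of y is strictly fractional."""
    def max_step(direction):
        steps = []
        for yi, di in zip(y, direction):
            if di > 0:
                steps.append((1 - yi) / di)
            elif di < 0:
                steps.append(yi / -di)
        return min(steps)

    alpha = max_step(r)
    beta = max_step([-v for v in r])
    # Probabilities are chosen so that the expected new point equals y.
    if rng.random() < beta / (alpha + beta):
        return [yi + alpha * ri for yi, ri in zip(y, r)]
    return [yi - beta * ri for yi, ri in zip(y, r)]


# Variables: x(i_1^1,j_1^1), x(i_2^1,j_1^1), x(i_2^1,j_2^1), x(i_1^2,j_1^2).
A = [[1, 1, 0, 0],   # assign constraint of query j_1^1
     [0, 2, 3, 0],   # advertiser constraint of i_2^1 (hypothetical bids 2 and 3)
     [0, 0, 1, 1]]   # capacity of S_1, with j_2^1 and j_1^2 treated as one node
y = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)]
r = null_vector(A, 4)
x = rand_move(y, r)
```

With three tight constraints and four variables ($2l_1 + 2l_2 - 3$ versus $2l_1 + 2l_2 - 2$), a nonzero direction always exists. Whichever branch is taken, the three equalities are preserved and at least one variable reaches 0 or 1.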

When the above does not apply, $T_2$ has no leaf node that is an advertiser.
If there is a leaf node that is a query, but the query is in a set that has a non-tight capacity
constraint, then we consider that path, say $P_2'$ (we use the same symbols as in $P_2$ for
$P_2'$, but it is not to be confused with $P_2$, since we consider $P_2'$ only when no path
like $P_2$ exists):

$P_2' = (j_1^2, i_1^2, j_2^2, \ldots, i_{l_2-1}^2, j_{l_2}^2)$.

Consider the combined path $(P_1, P_2')$ as before; that is, we treat $j_{l_1}^1$ and $j_1^2$ as a single
node while maintaining their total contribution to the set $S_1$. Because we consider
the combined path $(P_1, P_2')$, this becomes identical to Subcase 4, so we apply the
rounding on this combined path as in Subcase 4. The correctness of this rounding step
also follows from Lemma 8.
Otherwise, all the leaf nodes in $T_2$ are queries and the sets containing them have
tight capacity constraints. Follow a maximal path from $j_1^2$ to one such leaf node, say $j_{l_2}^2$,
and let it belong to set $S_2$. Denote the maximal path by $P_2''$.

Since Subcase 3 does not apply to $T_2$, $T_2$ does not contain another non-tight query
from $S_2$. But the capacity of $S_2$ is an integer and thus $S_2$ must have another non-tight
query. Call that query $j_1^3$ and denote the tree containing it by $T_3$. If $T_3$ happens
to be the same as $T_1$, then consider the path $P'$ in $T_1$ between $j_1^3$ and $j_{l_1}^1$. Now consider
the combined path $(P', P_2'')$. In this combined path the two end-points are two
non-tight queries from set $S_2$, which has a tight capacity constraint. Thus, this is identical to
Subcase 3 and we apply the rounding of Subcase 3. The correctness follows again from
Lemma 7.

Otherwise, $T_3$ is a tree different from both $T_1$ and $T_2$ and we continue similarly
from $j_1^3$. Thus, if at any point we reach a leaf node that is an advertiser, or a
query in a non-tight set, or a query in a tight set whose other non-tight
query belongs to a tree already visited, we can apply our rounding.

However, it may happen that a tight set contains more than two non-tight queries. In
that case, it is possible to visit a tight set more than twice in our process. So suppose we
are at tree $T_g$ and, while considering the maximal path $P_g = (j_1^g, i_1^g, j_2^g, \ldots, i_{l_g-1}^g, j_{l_g}^g)$,
we get to $j_{l_g}^g$, which belongs to a set $S_g$ that is already visited. That is, we have
already seen two non-tight queries of $S_g$ as end-points (one at the end of a maximal path
and the other at the start of a maximal path in two consecutive trees) of two maximal
paths, say in $T_h$ and $T_{h+1}$, $h + 1 < g$. Let the maximal paths that have been
considered in trees $T_{h+1}, T_{h+2}, \ldots, T_g$ be $P_{h+1}, P_{h+2}, \ldots, P_g$. Consider the combined path
$(P_{h+1}, P_{h+2}, \ldots, P_g)$ and note that in this combined path the two end-points are
two non-tight queries from set $S_g$, which has a tight capacity constraint. Thus we apply the
rounding of subcase 4. Indeed, it is not required to visit a non-tight query for the third
time as an end-point of a maximal path. If at any time in this process we visit a third
non-tight query from a set with a tight capacity constraint, we can write a combined path
whose two end-points are non-tight queries from that set and apply the rounding of
Subcase 3.

Otherwise, all the trees visited are different and we continue this process.
Since the number of trees is at most $\min\{n, m\}$, this process must terminate
in some tree $T_t$ and at some leaf query node $j_{l_t}^t$ within a tight set $S_t$. Since $S_t$ has at
least two non-tight queries, the other non-tight query, say $j$, must belong to some tree
$T_{t'}$, $t' < t$. Considering a path from $j$ to $j_{l_{t'}}^{t'}$ and then following the maximal paths in
$T_{t'+1}, T_{t'+2}, \ldots, T_t$, we again get a combined path on which we can apply the rounding
of Subcase 3.

Case (iii). No tree contains any leaf advertiser nodes. This case is similar to Case (ii).
We start with a leaf query, possibly with a leaf query that is in a non-tight set if one
exists, and obtain a combined path on which we can apply one of Subcases (1)-(4).

This completes the description of the rounding method. Each step of the
rounding procedure takes poly(n, m) time, and at each step we either make a constraint
tight or round a variable. Thus we are guaranteed to finish rounding all the variables
to integers in a polynomial number of steps.
From the above discussion and Lemmas 4-9 we get the following.



Lemma 10. The rounding procedure maintains all the assign and the capacity
constraints. An advertiser node maintains the advertiser constraint as long as, in the current
fractional solution, it is connected to two or more queries with nonzero fractional
values.

Now, we need to prove that our expected approximation ratio is
$\frac{4 - \max_i \frac{b_{i,\max}}{B_i}}{4}$, where $b_{i,\max} = \max_j b_{ij}$.
Since we can always assume $b_{i,\max} \le B_i$ for all $i$ without loss of generality,
we get a 3/4 approximation. If bids are small, that is $\max_i \frac{b_{i,\max}}{B_i} \le \epsilon$,
then we get a $(4 - \epsilon)/4$ approximation.

Proof (of the theorem).

Let $P_i^*$ denote the payment made by advertiser $i$ as assigned by LP1. In our rounding
process, when an edge-variable gets rounded to 0 or 1, it is removed permanently or
assigned permanently. The forest structure that we consider always contains only the
fractional edge-variables. If advertiser $i$ never has degree 1 in the forest, then by our
rounding procedure its final payment is the same as $P_i^*$. Therefore, suppose that at some stage
$s$, advertiser $i$ becomes a leaf node, let $a$ be the payment rounded on $i$ so far, and let
$b$ be the unique query assigned to advertiser $i$, with fractional assignment $p$ and bid $d$.
Note that $a$, $b$, $p$, $d$ are all random variables. If $P_i^s$ denotes the total payment (fractional
and integral) made by advertiser $i$ at the end of the $s$-th iteration, then we have

$P_i^s = a + dp = P_i^*$.




Once an advertiser becomes a leaf node, it only takes part in Rand-Move. Let
$P_i^{s+1}, P_i^{s+2}, \ldots, P_i^t$ denote the payment rounded on advertiser $i$ at the end of
iterations $s + 1, s + 2, \ldots, t$. Assume $t$ is the last iteration. Then we have from property
[P1] of Rand-Move that



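\[ E\left[P_i^g \mid P_i^{g-1} = a + dx\right] = a + dx \]

for $g > s$. Thus

\[ E\left[P_i^g\right] = \sum_x (a + dx)\,\Pr\left[P_i^{g-1} = a + dx\right] = E\left[P_i^{g-1}\right]. \]

Hence we have

\[ E\left[P_i^t\right] = E\left[P_i^{t-1}\right] = \cdots = E\left[P_i^s\right] = a + dp = P_i^*. \]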



It then follows directly from the above that, with probability $1 - p$, the rounded payment
on advertiser $i$ is $a$ and, with probability $p$, the rounded payment is $a + d$, since
$E[P_i^t] = a \Pr[\text{edge } (i, b) \text{ is rounded to } 0] + (a + d) \Pr[\text{edge } (i, b) \text{ is rounded to } 1]$.
Thus the final expected profit from advertiser $i$ is $(1 - p)\min\{B_i, a\} +
p \min\{B_i, a + d\}$. The profit obtained from $i$ in the optimal LP solution is
$\min\{B_i, a + dp\}$. Therefore, by linearity of expectation, the expected approximation
ratio is the worst-case (minimum) value of

\[ \frac{(1 - p)\min\{B_i, a\} + p \min\{B_i, a + d\}}{\min\{B_i, a + dp\}}. \]

This part of the proof is similar to the analysis of Theorem 1 of [21]. Let $b_{i,\max} =
\max_j b_{ij}$. We can assume without loss of generality that $b_{i,\max} \le B_i$ for all $i$. It is easy
to see that if $a \ge B_i$ or $a + d \le B_i$, then the above approximation ratio is 1. Hence
assume $a < B_i < a + d$. The approximation ratio is then

\[ r = \frac{a(1 - p) + pB_i}{\min\{B_i, a + dp\}}. \]

Now, considering the two cases $B_i \le a + dp$ and $B_i > a + dp$, we get the following result:

\[ \frac{(1 - p)\min\{B_i, a\} + p \min\{B_i, a + d\}}{\min\{B_i, a + dp\}} \ge \frac{4 - \max_i \frac{b_{i,\max}}{B_i}}{4}. \]
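
The two cases can be worked out as follows. This is one possible derivation under the standing assumptions $a < B_i < a + d$ and $d \le b_{i,\max} \le B_i$, not necessarily the argument of [21].

```latex
% Case 1: B_i \le a + dp, so a \ge B_i - dp.
\[
  r = \frac{a(1-p) + pB_i}{B_i}
    \ge \frac{(1-p)(B_i - dp) + pB_i}{B_i}
    = 1 - p(1-p)\frac{d}{B_i}
    \ge 1 - \frac{d}{4B_i}.
\]
% Case 2: B_i > a + dp. Here r = (a(1-p) + pB_i)/(a + dp) and
\[
  \frac{\partial r}{\partial a}
    = \frac{p\bigl((1-p)d - B_i\bigr)}{(a + dp)^2} \le 0
  \quad\text{since } d \le B_i,
\]
% so r is smallest as a increases towards B_i - dp, where it equals the
% Case 1 expression 1 - p(1-p) d / B_i \ge 1 - d/(4 B_i).
% In both cases, using d \le b_{i,\max}:
\[
  r \ge 1 - \frac{d}{4B_i} \ge \frac{4 - b_{i,\max}/B_i}{4}.
\]
```

The bound uses only $p(1-p) \le 1/4$, with equality at $p = 1/2$.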

Since we can assume without loss of generality that $b_{i,\max} \le B_i$ for all $i$, we get a
3/4 approximation. If bids are small, that is $\max_i \frac{b_{i,\max}}{B_i} \le \epsilon$, then we get a $(4 - \epsilon)/4$
approximation.
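
As a sanity check of these two bounds, the ratio can be evaluated exactly on a grid. The sketch below is illustrative only: the function names `ratio` and `worst_ratio`, the grid resolution, and the normalization $B_i = 1$ are assumptions, and a grid search is a check rather than a proof.

```python
from fractions import Fraction as F


def ratio(B, a, d, p):
    """Expected rounded profit divided by the LP profit for one advertiser."""
    rounded = (1 - p) * min(B, a) + p * min(B, a + d)
    fractional = min(B, a + d * p)
    return rounded / fractional


def worst_ratio(B, eps, steps):
    """Smallest ratio over a grid of (a, d, p) with 0 < d <= eps * B and p > 0."""
    grid = [F(k, steps) for k in range(steps + 1)]
    worst = F(1)
    for ga in grid:                 # a ranges over [0, B]
        for gd in grid[1:]:         # d ranges over (0, eps * B]
            if gd > eps:
                continue
            for p in grid[1:]:      # p ranges over (0, 1]
                worst = min(worst, ratio(B, B * ga, B * gd, p))
    return worst


general = worst_ratio(F(1), F(1), 8)         # bids may be as large as the budget
small_bids = worst_ratio(F(1), F(1, 4), 8)   # bids at most a quarter of the budget
```

On this grid the minimum is attained at $p = 1/2$ and $a + dp = B_i$, matching $(4 - \epsilon)/4$ for $\epsilon = 1$ and $\epsilon = 1/4$.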